I’m not taking on the Alt.NET world
I’m not taking on the Alt.NET world because I value their contribution. I’m not trying to start any battles or wars. At the same time, I think things need to be said about the attitudes and actions of that community to keep things in perspective and foster more effective dialog across the entire industry.
The synergy of the greater Alt.NET world has led to a maturity and cohesion in tools that is very impressive. And I continue to learn from that space as from an array of other spaces. I value that.
I think the relationships of individuals within the Alt.NET community are an amazing discordant symphony that is pushing ideas forward at an astounding pace. My occasional brief observation – most often occurring in hotel bars – is that ideas float and are shot down a dozen times, slowly morphing through the head of another amazing individual. Eventually some of the ideas gain traction and someone writes some code or otherwise formalizes it. Formalization allows formalized criticism and therefore improvement. This is really cool. It does not, however, look like any fun. Words like stupid and misguided and childish fly around at an amazing pace. The negativity and intensity make it just short of a rugby match without teams. But the output is amazing in quantity and quality of thought.
The movements that Alt.NET champions have helped shape our actions and conversation. I have the opportunity to interact with hundreds of programmers a year and I have a sense of what they are doing. Continuous integration is making real inroads into how groups work, and formal “integration” phases built into schedules are becoming as rare as the Preble’s meadow jumping mouse.
But this, even combined with the fact that the surviving results work for the practitioners involved, does not support the corollary that it is the best development strategy for all, or even most, shops. Cohesion and maturity do not define the best approach for the vast numbers of programmers that make up this industry. That’s why it is a good thing that Microsoft did not blindly follow the pattern that worked for the relatively small Alt.NET community when developing Entity Framework. Entity Framework is a far broader initiative and EF must work in scenarios where the other pieces of Alt.NET style development are not in place (BDD, behavior based objects, test first development, etc).
If the Alt.NET ideas are the whole answer, why isn’t everyone using that approach? If it’s because everyone hasn’t personally been indoctrinated by working for months on an Alt.NET project, as I understood Scott Bellware to be implying about me in a recent comment on my blog, then Entity Framework cannot succeed regardless of the perfection of the tool. If you have to go be personally instructed, you can no more be personally instructed in EF than in NHibernate.
Entity Framework should not block any technique, including agile, additional infrastructure, code generation, rules engines, workflow, SOA, and dynamic user interfaces, to give a top of my head list. But neither should it be built in the vision of one existing – and therefore outdated – approach to software development. The change in terminology from TDD to BDD illustrates how fast thinking within the Alt.NET community changes. Entity Framework cannot chase these changes but must blaze its own trail based on the best thinking in every community.
I am not defending EF v1. It has issues. That’s not the point. The point is that the Alt.NET community is not a group of priests or gurus* that understand what the rest of us poor peasants need. Not for EF, not for broader development. I don’t believe in buying what other people tell you, thus what I value is not the “market” with its top down stories – in this case not Microsoft’s strategy outright. I believe in the intelligence of an industry evolving together – strictly the evolution of memes where continuous integration wins and behavior driven objects haven’t. Bottom up intelligence. What works in the trenches of Cleveland or Huntsville or Cut ‘n Shoot or Miami or Fairbanks. That’s what I want Microsoft to prospect and drill for. That’s what I want the Entity Framework team to become impassioned about uncovering.
Finding bottom up intelligence is an interesting ballet because it requires leadership, test products and the ultimate programmer in the trench getting out an application. Alt.NET actually represents all three of those things, so as a self-reinforcing environment it is interesting and important. I think anyone not paying attention to the ideals espoused is missing an opportunity. But the techniques and tools coming out of Alt.NET must meet the much higher bar of relevancy to the broader programming industry and by and large they have not made that cut (and incidentally, neither have my quite different ideas).
The interesting question – for all the good ideas that the broad industry does not embrace – is why? That’s the question I wish we were discussing. But that conversation is nearly impossible to have if predicated on the assumption that one set of ideals (agile is the current fashion) is considered to have more value than others. Meaningful conversations can’t happen until the visible presence of the Alt.NET community appears much more willing to listen and much less intent on its own internal modus operandi of shooting down ideas simply to see what can survive the onslaught.
*The priest guru thing has not been stated, but I’ll be happy to stand up and help squash implications of superiority in any group or individual within this industry. From Alt.NET “We are a self-organizing, ad-hoc community of developers bound by a desire to improve ourselves, challenge assumptions, and help each other pursue excellence in the practice of software development.” But the most important word in this declaration is the first “a”. There are many, many, many programmers outside this community that share these espoused ideals and that are successful in pursuing other strategies to obtain the only real goal – saner, faster development of quality software to meet business needs.
I feel I need to respond to the “Vote of No Confidence” on the Entity Framework.
I have little interest in petitions. They are by nature backwards looking. To get a group of people to sign onto something they have to either understand it or be driven by the charisma of the leaders. In this case, I assume the first. The contents of the petition must be stable and old enough that everyone has worked out the details. That’s the case with all the technical petitions I can think of, although admittedly that’s just a handful like the VB6 petition.
When it comes to the appalling scenario where we have at least 14 major categories of data access strategies in use in new projects today, we need the Microsoft teams to look forward and be creative in combining the best set of techniques – NOT pick one of the existing strategies and latch on to it because it came from the group that yelled the loudest.
Entity approaches are good because they better separate the business and data sides of our middle tiers. But they are also inherently difficult and inaccessible to most programmers. Entity Framework’s goal must be to bridge this gap. That means being extremely creative in picking its battles to reach toward the real world developer – not copy a strategy that is available to that developer today and fails (the combination of NHibernate and other tools used in a specific style of development). The failure is not because NHibernate is an Open Source tool. It’s not because people don’t know about it. If it worked in the majority of shops it would burn through our industry like wildfire. Why don’t you use them? Because they do not fit your development environment! This is not an easy problem that someone’s solved and Microsoft is looking the other way. It’s an incredibly hard problem – how do I know? I’ve been working on occasionally novel solutions to the problems for 20 years.
Entity Framework has issues. This is not news. It’s not even news to the Entity Framework team.
- EF is not a failure because it doesn’t fit TDD development
- EF is not a failure because business logic goes into partial classes
- EF is not a failure because it treats data as an important part of biz objects
- EF is not a failure because it accepts that most people do data first development
- EF is not a failure because lazy loading is hard – lazy loading can destroy performance
- EF is not a failure because its design tools are 1.0 level
- EF is not a failure because it has a poor strategy for merging into source control
All of these are potentially issues, but it’s critical, essential, I cannot yell this loud enough – Entity Framework must not be designed for the group that is best organized and screams the loudest. This already happened once with the disastrous IPOCO attempt that helped no one and wasted a lot of manpower that could have improved mapping and provided better metadata.
But then I’m sort of caught in a corner, because an important point of the petition is correct. Be cautious with EF. Do not jump into Entity Framework because of Microsoft marketing. It’s a tough platform that will get a little easier when the current spasm of books comes out. The niche is pretty narrow and if you step off the boards, the quicksand can be pretty deep. Treat it like what it is - an amazingly large and complex project that is being released as a 1.0 product. It’s an infant. The metadata and mapping still stink. Look at it as Microsoft’s current future direction, but remember how many current future directions we’ve had over the last 15 years (around ten) and remain skeptical.
AdventureWorksLT might look like a good database to get started with if you want to look at the GenDotNet tools (they are still in a very alpha state), but it’s not. The reason is the use of multi-part primary keys. I don’t know how much to support these in the tool, so for now they just aren’t supported.
The underlying question is whether your many-to-many tables should have their own primary key and use the parent primary keys as logical keys, or whether the parent primary keys should be combined into a multi-part primary key. Because my many-to-many tables often have, or evolve into having, a payload, I use the first approach. The second becomes relatively complex at a number of points in the generation process.
If you feel multi-column primary keys are tremendously important or valuable, I'd love to know why.
For now, multi-column primary keys and tables without primary keys are simply not supported.
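The two many-to-many designs discussed above can be sketched side by side. This is only an illustration, not how GenDotNet works: it uses Python with an in-memory SQLite database, and the `student`, `course`, and `enrollment_*` tables, along with the `primary_key_columns` and `is_supported` helpers, are hypothetical names invented for the example.

```python
import sqlite3

# Hypothetical schema: two parent tables and the same link table modeled two ways.
SCHEMA = """
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);

-- Approach 1: the link table has its own primary key and the
-- parent keys form a unique logical key.
CREATE TABLE enrollment_surrogate (
    enrollment_id INTEGER PRIMARY KEY,
    student_id    INTEGER NOT NULL REFERENCES student(student_id),
    course_id     INTEGER NOT NULL REFERENCES course(course_id),
    grade         TEXT,  -- the payload that link tables tend to grow
    UNIQUE (student_id, course_id)
);

-- Approach 2: the parent keys are combined into a multi-part primary key.
CREATE TABLE enrollment_composite (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    grade      TEXT,
    PRIMARY KEY (student_id, course_id)
);

-- A table with no primary key at all.
CREATE TABLE audit_log (message TEXT);
"""


def primary_key_columns(conn, table):
    """Return the primary key column names of a table, in key order."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk);
    # pk is 0 for non-key columns, otherwise the 1-based position in the key.
    keyed = sorted((row[5], row[1]) for row in rows if row[5] > 0)
    return [name for _, name in keyed]


def is_supported(conn, table):
    """Assumed rule for this sketch: exactly one primary key column."""
    return len(primary_key_columns(conn, table)) == 1


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

for table in ("enrollment_surrogate", "enrollment_composite", "audit_log"):
    print(table, primary_key_columns(conn, table), is_supported(conn, table))
```

With the first approach a generator can always address a row by one column, which is part of why the second approach adds complexity at several points in generation.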
Yes, amazing as it sounds, I probably need to teach you how to use Object Browser.
I know, I know, you think you know. Open a project, hit F2 or whatever key you have mapped, and voilà, you see Object Browser.
Crippled.
To use Object Browser in a non-crippled state, you need to do some things that are, let’s pick a word, bizarre, surprising?
Open a new instance of Visual Studio 2008. Leave your old one open or close it, it doesn’t really matter. Open Object Browser in this empty instance of VS and you’ll find a combo box in the upper left. Drop it down and select “Edit Custom Component Set.” Browse to the location where you’re building your startup project (it should contain the appropriate assemblies) and select all the assemblies for your project.
That’s right. Don’t open the solution in Visual Studio, but open an empty Visual Studio and select all the assemblies of your project into a custom component set.
Now, select a class that is interesting and you’ll notice not only a node for base types, but also one for derived types. It takes a minute to search a large solution, but you can see the basic relationships from within Object Browser.
Cool.
The empty Visual Studio was certainly not obvious to me, but thanks to Bill McCarthy and DJ Park for showing me the light.
Now, here’s your part. I filed a Connect issue asking for this to work inside your solution, and it was closed. I’m not sure how loud to scream about this for VNext. Does it matter to you that this works inside your solution? How important is it relative to other features that have been floated, such as Call Hierarchy?
I think it’s important to differentiate between a likable user interface and a good one. A user interface can be likable and bad. It can be good and not likable.
I got a comment on my last post that said because someone liked the UI it was good. I disagree. A good UI supports you in all actions, at all phases of learning the program. That it happens to fit the particular set of features you use now and the stage of learning you are in now does not, ever, make an interface a good interface.
And I did not mean to imply that the Office 2007 user interface was entirely without merit. It attempts to address the fact we’d outgrown graphic toolbuttons, and toolbars as the sole organizational item. Those of us that could organize our toolbars pretty much liked them. Other users were often stuck with menus.
But menus have been the backbone of all of our learning programs and, as I understand it, screen readers. Bill McCarthy has a great post with an exercise to show you how deeply screwed up Office 2007 is to the Windows reader. He compares it to Notepad. If someone can compare to Office XP and post, I’ll be happy to link. Accessibility is important on two levels – it certainly retains its focus on people that can be enabled or disabled by specifics of the world around them.
If we are average in sight and dexterity, we expect computers to react to our own level of ability – we expect the mouse to adjust to our dexterity, the keyboard to be sized to our finger reach, and the font to be a readable size. If we are slightly off average, we compensate with larger fonts or a large trackball. If we fall outside the anticipated norms, we fall off a cliff, as shown in Bill’s exercise. Computers have the capacity to be a more level playing field, to extend enabling further.
And the second level at which accessibility is important is during the lifetime of people with average abilities. Almost all of us will have limited ability when we are young and old. Many of us will pass through stages of temporary limited ability. A few years ago when my arm got wrecked I had my mom (a programmer in her own right) reformat my code. I could write the code, but I couldn’t quite handle arranging the declarations with one parameter per line the way I wanted to deliver.
More attention needs to be paid to making computers physically easier to use. It will also save some people from life changing RSI injuries.
Which was one point of the post.
The other was that much as the Office 2007 user interface sucks, I do not think it is beyond redemption.
- Add back the menus as an option and the default appearance when you start
- Raise the logo to a button and have a timed “look here” toolbar/arrow (although I’d really prefer to see those things in a ribbon page)
- Have the Alt popups also include short cut keys (Ctl-I, Ctl-B) so these are discoverable
- Fix accessibility, which might be a fully separate user interface for screen readers
- Allow the font on the ribbon to be changed
- Have the ribbon bar learn my habits. Do not collapse Word Count which I use daily.
- Re-prioritize the Home ribbon page. Right now the ugly and rarely useful styles take up ½ the screen width, while Cut/Copy and all the very common formatting items remain small icons
- Allow me to promote commonly used items to the ribbon. I use Paste/Special/Unformatted all the time and it’s buried
- Rethink the ribbon in light of the wasted space on each side of the document on a standard monitor. Pull things like styles back into sidebars. Use this space with abandon. Consider allowing me to move ribbon pages there, or at the very least create a magnificently large and beautiful set of shortcuts.
- Fix the Quick Access toolbar. It’s a good idea gone bad. It’s as far as you can get onscreen from where I’m usually typing, there are no displayed shortcuts, and it uses small icons. An absolute ton of work could be done here.
And those are the thoughts of someone not a professional UI person. If the Ribbon solved accessibility, used space effectively, morphed automatically to how I work, it does have potential.
But right now, it sucks.
I have a hundred (ok dozen) finished not quite ready blog posts. Except it’s hard to finish them because I don’t particularly like saying hard things. Negative things. Some of which will be brutal to people I have enormous respect for and consider friends.
I’ll get back to the technical things. It’s just the code gen stuff has been evolving at a background level in real projects and I need to work out verbalizing the core, the best practices of the details. And, my column takes up some of my Tips and Tricks type stuff.
This is one of a series of passion posts – posts about how deeply screwed up our industry is becoming because we are tied to Microsoft and they are becoming rather screwed up.
Let me start with someone I don’t know, that way it’s easier for me.
Redmond Developer published a cover story regarding Steven Sinofsky replacing Jim Allchin as head of the Windows team. That’s cool. Let Jim move on to whatever pleases him. Maybe he’ll have time for lunch as he’s second on my list of people I’d like to meet (and no, neither Gates nor Ballmer is first).
The entire article was about Sinofsky “holding his cards close” - meaning we don’t know what’s coming in Windows 7. First, I agree completely with the article. We should rise in revolt at any attempt to remove transparency from the Microsoft development cycle.
Microsoft wishes to believe it is just a company making money. It is not. Perceptions to that effect are bad for us and bad for Microsoft stockholders. Microsoft’s job is not to create great new products. Its job is to lead an ecosystem with great new products. Out here, that’s not a subtle difference. Microsoft decisions affect trillions of dollars in investment and assets for companies around the world. They have an absolute obligation to be transparent and if they do not continue to lead through transparency – which is the only way to ride the tornado they created – we must call the bluff that we have no options. We are certainly not there now, but we as the consumers must not bow down to the vision that we have no choices. As individuals today, we may have little choice. But as an industry, we can create choice.
It is imperative that Microsoft lead by transparency.
But the article ignores the elephant in the room. Sinofsky came from Office – unless my timeline is warped – he brought you Office 2007. Let me choose the nicest words possible: Office 2007 is an abomination. Do not believe for one second that this was an attempt to make your life easier. Do not believe that any honest usability lab could have shown this UI to be useful to you. And for reference I use it and have for over a year, so this is not a comment on the initial shock factor. Nor will I waste your time with the stupidities in the interface. Let’s jump to why.
It was an attempt to protect Office - at your expense. Open Office is pretty damn good. It’s run on most of the computers in my home. We exchange documents with Office on a regular basis. There is no true force keeping people on Office for the vast majority of document and spreadsheet creation. Microsoft knows this. So it created a user interface that it believes it can protect. If you don’t believe me, look at what you have to sign to use the interface.
Ha! Let’s put the features you’re guaranteed to use every day under a logo that does not look like a button. Mom becomes the geek of her retirement community whispering that secret. Let’s backtrack on accessibility – don’t let them change font size. Let’s have the only discoverable way to make italics be Alt-H, Alt-2. OK, we’ll leave Ctl-I but to discover it (previously in the menu caption) make them type Alt-F, Alt-I (you can’t tell if that’s a 1 or the cap letter I on screen either), six down arrows, enter, Alt-t, one down arrow (assuming you know which tab Italics is on), tab, two down arrows. (Think you’ll never have a stroke or skiing accident?) Oh, and does any of this work right to left yet? Full stop. I could go blog post after blog post on what’s wrong with the Office UI. This UI was created to be different and protectable, not better for you.
The core issue is that Microsoft put the person in charge of the most extreme shift to controlling the ecosystem since Lotus and AmTrak fought over the sliding bar interface (if you don’t get that joke, never mind). Protection of a grossly overpriced Office superseded the good of the ecosystem. And the person in charge of that mess is now running Windows. You worried? Add in that he is being allowed to reverse a companywide shift to transparency apparently started by Ballmer himself eight years ago.
The shift away from transparency is couched in the Karl Rovian phrase “translucency” meaning secrecy.
Transparency means “my people will not be afraid to talk to your people.” I’ve only seen two fallouts from transparency: Insider’s groups whine a little about not having much warning ahead of the public (It is beautifully short, often at zero) and people being disappointed when Microsoft backtracked, particularly on Vista, previously known as Longhorn.
Hey, I was there. I drove through fire to get to the Longhorn PDC and ice to get home. Microsoft got explicit feedback (from me and many others) that it was too grandiose a plan “even if you could do it, which you can’t, we can’t uptake it.” The shrinking of Longhorn was the right thing to do and the stupid thing was that they said they could do it all. You don’t fix that by removing transparency. In fact the transparency around Longhorn was important - there was a lot of feedback regarding which features had marginal value like the whole let’s do the file system in SQL so we can organize our photographs stuff.
Translucency means “I will control what you know.” It means things aren’t public and there isn’t a route for you to give feedback until a beta stage – in case you haven’t noticed, betas happen at feature complete meaning your input is explicitly excluded in terms of shaping the product. It means people can’t talk. And, it is not the solution for the issues given. Just look at Charlie Calvert’s or Paul Vick’s blog. They don’t say “This is what we’re going to do.” They say “this is what we’re thinking about.” It’s transparency with honesty and realism. And it gives you an opportunity to shape the product in public discussions. It wasn’t transparency that led to Vista disappointment; it was a lack of honest assessment of reality.
This is getting long, but I want to answer the 89 people that have already started writing comments that Microsoft is a company and is in the business to make money. That is totally true. But, as the keeper of the ecosystem, they make money by managing the ecosystem to the ecosystem’s advantage. They cannot help but make money if the Microsoft ecosystem is healthy. They will wither as a natural result if the ecosystem is damaged, no matter how they contrive to exploit the dying ecosystem. It cannot be any other way. Trying to protect the ridiculous price of Office with a unique but terrible UI metaphor to perpetuate the myth that people must use Office is a disservice to shareholders. Hiding the future of Windows 7 prevents us from giving feedback. Here it is: the most important thing in Windows 7 is to get Vista right: fix the driver issues for legacy hardware, improve performance, fix a few annoying bugs like that stupid one that toasts the details layout, keep up the improvements in security. Hmm. That’s about it. Market it showing off the cool features it already has and only throw in new things that work really well.
The most valuable thing Microsoft could do for its future position and the ecosystem would be a commitment that Vista will be compatible with all existing hardware and to write the drivers themselves if necessary. Twenty five years ago, in the midst of the Xerox PC debacle, a tiny little company in Houston took on IBM by promising if software didn’t run on its OS, they’d fix it. Guess who? That’s a powerful promise. If Microsoft can’t make that promise on drivers, they screwed up and need to fix it in Windows 7.
Either that or they need to plan a decade long strategy for uptake, including ongoing availability and upgrades to XP.
If the Windows team was transparent, showing us flashy features, our answer would be “for god’s sake, just make Vista work well”.
This is your ecosystem. Don’t stick your head in the sand. Pay attention to what’s going on. Talk about it. Scream about it.
---
PS. I don't know how widely used the "elephant in the room" metaphor is. It means we have something so big we can't be unaware of it, but at the same time, we're avoiding talking about it.
The new tools need a new UI. The old one is hopelessly mired in ancient ways of doing things (isn’t that astounding to say about .NET code). To focus on the core features, the UI took a back seat with a temporary UI with lots of shortcuts. Several of these shortcuts were just wrong…
One is to organize by template type (core object, child list, readonly object, readonly list, hierarchy) and within this organization include files for editable and handcrafted files. I haven’t converted this project (and may not) to use partial methods, so I’ve got a three class strategy – base, derived designer, derived handcrafted. I have overwritten the editable files about 65 times because of this organization.
Which brings me to the second shortcut we should never have taken: This version is the first harness I’ve used in over six years that overwrites handcrafted files. I have no clue how you folks that use most existing tools survive. I’m just dying under the rewrites and have become totally dependent on source code recovery – something I almost never do with normal code.
Unless you've got a good reason to do it otherwise, organize as Project/Generation Group/Template Group NOT Project/Template Group/Generation Group. IOW, use Project/Autogenerated/RootObject/files and Project/Editable/RootObject/files NOT Project/RootObject/Autogenerated/files and Project/RootObject/Editable/files. This allows easier file management such as deleting directories and managing source control.
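As a rough sketch of the layout recommended above, the generation group can be placed ahead of the template group when building output paths. The code below is Python rather than anything from GenDotNet, and the `output_path` and `should_write` helpers, the `MyProject` name, and the write-once rule for editable files are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Generation groups named in the post; the constants are illustrative.
AUTOGENERATED = "Autogenerated"
EDITABLE = "Editable"


def output_path(project, generation_group, template_group, file_name):
    """Build Project/GenerationGroup/TemplateGroup/file, in that order."""
    return PurePosixPath(project) / generation_group / template_group / file_name


def should_write(generation_group, already_exists):
    """Assumed rule: always regenerate autogenerated files, but write an
    editable (handcrafted) file only when it does not exist yet."""
    return generation_group == AUTOGENERATED or not already_exists


print(output_path("MyProject", AUTOGENERATED, "RootObject", "Customer.cs"))
print(output_path("MyProject", EDITABLE, "RootObject", "Customer.cs"))
print(should_write(EDITABLE, already_exists=True))
```

Because everything regenerated lives under one top-level directory, deleting it or excluding it from source control is a single operation, while the editable tree is left alone.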
I'm not planning on fixing this in the temporary GenDotNet CTP user interface because I hope to replace that UI with a configurable one that gives you full control over your project structure.
OK, I’m blogging again. Why the silence? The last month was tremendously painful as I tried to keep up with clients, train a puppy, get slides ready for DevConnections next month, and tackle getting the first GenDotNet tools CTP out. The last proved to be about an order of magnitude more difficult than I expected. As such, I need to set expectations for this CTP.
I’m not sure how interesting the last is, except to say it’s amazing the things new sets of eyes see. Particularly with a large project where it’s tough to get a toe hold. And it is big. There are over 1000 files, over two thirds of which are legacy stuff temporarily in play.
This is a CTP – for the things that it focuses on, it’s beyond the proof of concept that I’ve put in my blog. But for many, many things, it’s duct tape and baling wire to get something standing up to test. I’ve written on this in the Codeplex/GenDotNet opening page, which I think is a more appropriate place.
To be honest, I wouldn’t have gone public if CodePlex didn’t demand it. But then it was probably a good thing for me and the project to have a hard deadline. The project is now up and ready for people to shoot at it and contribute. Just don’t shoot at it because it’s grossly incomplete. Shoot at it because of the parts I focused on – which I’ll talk about more in the next few days.
Look for far more usable drops in the next two months as I and other people have a chance to focus on bringing our vision for other portions of the project to fruition. I'm really excited about where we're going, but can't help but contemplate the image of my son, wrinkled and dirty when he was first born, now heading off for graduate study at one of the finest math departments in the world (he hasn't made a decision yet on which of them). This is breathing. It's started, and it will grow. And I promise, it will not take 21 years to come into its own.
And, the puppy is beautiful and very, very smart. Photos bound to appear soon.
I haven't blogged for a while because I am up to my ears in the CodePlex project and a client project. I look forward to getting back to it.
In the meantime, I got an email from Eric Smith that CodeSmith has an open position in the Dallas/Fort Worth area. If you're interested in making a difference in the future of code generation by working on this tool, check out http://www.linkedin.com/e/vjb/493824/.
I brought my new puppy home from Nebraska yesterday.
A sweet little eight week old girl (edited: I have no idea what planet I was on when I mistyped this initially). I've had a hard time finding the right combination of home raised and decently bred Springer Spaniel. I finally found a lovely little girl 3 hours away, but it was worth every minute of the drive to find a puppy that is beautiful and adventuresome. Thanks to my children's stepmom Helen who helped with the drive. I'm not sure quite how we avoided bringing home her sister as well. Five families, including Helen's, will help me co-puppy parent when I travel.
I haven't given her a permanent name, but her puppy name is Dash. That made a soft spot in my heart as Aspen the dog I lost after 14 years together had a puppy name of Dot. Both named for markings on their forehead. Otherwise they are different. Dash is a tricolor with slight tan eyebrows and a fairly short nose.
Expect my code to come back up to snuff after I train her to be as good at listening to my .NET design problems as Aspen was. Right now she tends to sleep through my questions.
I loaned my mom my camera when she went to Thailand, and now I'm going to have to nag her and get it back - so sorry, no pictures until I borrow one.
Thanks for letting me share that this is a very special time for me. Pictures coming soon.
Karl on WPF posted this on code generators.
Karl appears to be largely talking about issues within existing MS code generation tools. The underlying problem is that Microsoft does not have a code generation group that understands code generation from an application perspective and acts as a core resource to demand standard metadata, force Visual Studio support, explore problems like the attribute issue Karl raises and a host of other issues. There’s certainly a team around the CodeDOM, but it is entirely inappropriate for our use. And there’s one around DSL, but it’s not language neutral and it’s not incorporated by teams.
For the seven years I’ve been doing code generation and talking about it outside Microsoft I have never found a single person inside Microsoft that understands code generation the way Karl, Steele, Miguel, and the dozens of other people that actually fight through using code generation understand it. While I think I’ve had a few important ideas, the overwhelming majority of what I’ve thought and written is the same thing almost anyone who spent as much time as I have with code generating real applications would come to. This is NOT magic; it’s NOT core research anymore. It’s a well known discipline with well known problems and it’s about jolly time Microsoft stopped playing at the edges and got serious about it.
Metadata is my second core principle and the lack of standard metadata that Karl talks about is a real problem. If a genie gave me three wishes, I would stop world hunger, turn half the teachers in elementary schools into men, and poof! create a rich robust standard metadata structure. OK, maybe I’d pick my priorities a little differently if the genie actually shows up.
We sort of have standard metadata now. The problem is it’s too wimpy and it was designed without an extensibility model. The standard metadata is the Entity Framework edmx, and it’s a huge step. The problem is the combination of wimpiness and lack of extensibility model. As soon as we add something simple like plurals, you and I do it differently and the metadata is no longer standard.
You and I will have some extreme side cases, but 99% of our metadata will be the same content and must use the same approach. For example, Karl wants to put the “pull-down” data in the table (the high school graduated from in the college student table). I think he’s wrong. Foreign keys are always combo boxes or another lookup style and the primary key table (the high school table) should be responsible for telling the outside world what it needs to display:
· I’m a static lookup – you can cache me
· The short name for each row is in this field
· The long name for each row is in these fields concatenated like this
· I have about x records so you know what style of UI makes sense to you
· I can be filtered on these fields (optional- country/region problem)
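As a rough illustration of the list above, the hints a primary key table might publish could be captured in a small structure. This is a sketch in Python (chosen only so it can be run as-is); every name in it, including `LookupInfo` and the row-count threshold, is a hypothetical stand-in and not part of any existing metadata standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LookupInfo:
    """Display hints owned by the primary key table (all names hypothetical)."""
    table: str
    is_static: bool               # static lookup: safe to cache
    short_name_field: str         # field holding the short name of each row
    long_name_fields: List[str]   # fields concatenated to build the long name
    long_name_separator: str = ", "
    approx_rows: int = 0          # rough record count, drives the UI choice
    filter_fields: List[str] = field(default_factory=list)

    def long_name(self, row: dict) -> str:
        # Concatenate the configured fields in order.
        return self.long_name_separator.join(
            str(row[name]) for name in self.long_name_fields)

    def suggested_ui(self) -> str:
        # The threshold is an arbitrary illustration, not a recommendation.
        return "combo box" if self.approx_rows <= 50 else "search dialog"


high_school = LookupInfo(
    table="HighSchool", is_static=True, short_name_field="Name",
    long_name_fields=["Name", "City"], approx_rows=30000,
    filter_fields=["Country", "Region"])
```

The point of the sketch is only that the college student table carries a foreign key and nothing else; everything the UI needs to render the lookup comes from the high school table's own description.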
This illustrates that metadata always looks simpler at first blush than when you spend years with it. But there are a dozen people in this country that could come together and create a metadata standard that we could live with for a very long time. And it wouldn’t look like the gobbledygook that OMG created. Simple clear rich metadata as the bar we all reach and metadata would never be the same. Tools would quickly arise to fill that structure - including UI tools and storage models for the parts of it you can't extract from databases. The only way this can happen is for Microsoft to get serious about the problem, call a summit, then be willing for the result to differ markedly from anything they currently have leaving a lot of work for all of us to do.
As a side effect, this would open the door for model driven (as opposed to database driven) design. We're using databases as our prime source because a) we know how to make them and b) it's the only structure we can guarantee will have meaning in five or ten years. No metadata structure today has that guarantee. Standard metadata would remove that major hurdle in improving the overall way we create applications. Code generation is one little cog - but since it's broken the whole machine doesn't work. Fixing it will expose the next cog that's broken, but there is a sane way to develop applications and writing 2 million lines, or even 10,000 lines of semi-static code is not it.
In the meantime, look for the GenDotNet generator to isolate metadata in a separate pluggable layer to allow use with a variety of metadata formats. This insulates you from many types of metadata changes. It's the best we can do today.
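One way to picture a pluggable metadata layer is an interface that templates talk to, with one adapter per metadata format. The sketch below is in Python for brevity and is not the GenDotNet API; `MetadataSource`, `DictMetadataSource` and `describe` are hypothetical names made up for the illustration.

```python
from typing import Dict, List, Protocol


class MetadataSource(Protocol):
    """What a template needs from metadata, whatever format stores it."""
    def entities(self) -> List[str]: ...
    def nullable_columns(self, entity: str) -> List[str]: ...


class DictMetadataSource:
    """Stand-in adapter over a plain dict; a real adapter might read edmx."""
    def __init__(self, data: Dict[str, Dict[str, bool]]):
        self._data = data

    def entities(self) -> List[str]:
        return sorted(self._data)

    def nullable_columns(self, entity: str) -> List[str]:
        return [col for col, nullable in self._data[entity].items() if nullable]


def describe(source: MetadataSource) -> List[str]:
    """A 'template' that only sees the interface, never the storage format."""
    return [f"{e}: {', '.join(source.nullable_columns(e)) or 'none'}"
            for e in source.entities()]
```

Swapping the metadata format then means writing another adapter, while `describe` and anything else written against the interface stays unchanged.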
Hopefully I’ve finally found a puppy and I am desperately trying to tie up loose ends to drive to Ogallala Nebraska so I’m putting this post up without my usual edits. Please let me know if I garbled any of it.
Bill McCarthy added a comment to my blog which I wanted to answer:
So why not use VB for the templates but C# for the initial output rather than some "Nearly VB" . Doesn't C# address every issue you've raised ?
But I am curious as to what about issues that are language specific, such as declarative event wiring, optional parameters etc ?
C# fixes the majority of the issues I raised, except ambiguity in closing brackets. If you assume that the closing of a structure will always be at the same level and outside embedded expressions such that you maintain symmetry in relation to the evaluation stack, you can resolve the closing brackets. Retaining symmetry in closing brackets means that the following will work:
Return _
<code>
if (x == 1)
{
<%= MoreStuffFunction() %>
}
</code>.Value
Any variation of the following will not work:
Return _
<code>
if (x == 1)
{
<%= MoreStuffFunction() %>
<%= "}" %>
</code>.Value
Which means among other things you cannot do:
Return _
<code>
if (x == 1)
{
<%= StuffFunction() %>
<%= If(z, "}", MoreStuffFunction() & "}") %>
</code>.Value
But you can rearrange it to:
Return _
<code>
if (x == 1)
{
<%= StuffFunction() %>
<%= If(z, String.Empty, MoreStuffFunction()) %>
}
</code>.Value
Bill believes this symmetry restriction is less onerous than the restrictions I placed on VB, especially the open/close parentheses on method calls. Another significant value to the C# first approach is that it’s much easier to recognize equals comparisons in assignment statements, and some of the null comparison problems I’m currently ignoring will be lessened because C# does not allow certain comparisons with nullable that VB allows.
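The symmetry rule can be checked mechanically: ignore everything inside embedded expressions and confirm that the literal text of the code block opens and closes its brackets in balance. The following is a minimal Python sketch of that check under simplifying assumptions (it does not look inside string literals in the output code, and `braces_are_symmetric` is a hypothetical helper, not part of the preprocessor).

```python
import re

# An embedded expression runs from "<%=" to the next "%>".
EXPRESSION = re.compile(r"<%=.*?%>", re.DOTALL)


def braces_are_symmetric(code_block: str) -> bool:
    """True if literal { and } balance once embedded expressions are removed."""
    literal = EXPRESSION.sub("", code_block)
    depth = 0
    for ch in literal:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:        # a close bracket with no matching open
                return False
    return depth == 0


good = 'if (x == 1)\n{\n<%= MoreStuffFunction() %>\n}\n'
bad = 'if (x == 1)\n{\n<%= MoreStuffFunction() %>\n<%= "}" %>\n'
```

Here `good` corresponds to the first fragment above and `bad` to the one whose closing bracket is produced by an expression, which is exactly the case a context-free preprocessor cannot see.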
Bill convinced me that the C# first template was not nearly as difficult as I imagined, by convincing me that the restriction on the location of the close brackets in symmetry with the open was reasonable. However, he didn’t convince me to change my current work. VB first is the best scenario for my current client and I think if we have the possibility we should try to supply both so people can write and maintain their templates with the output code that they prefer, and prefer to debug the first version of the output in. Hopefully this can happen, but the most important thing to me at the moment is getting a working version out to you to play with – I don’t want to derail that with a second template converter/preprocessor. If someone else wants to work on that… :) let me know.
The templates I’ve been talking about require very specific language features of the VB compiler and language neutral templates do not allow any ambiguity in the code output in the initial template.
The template itself must be in VB because it’s required for embedded XML – the code blocks. The code blocks are essential for understanding which code to translate when creating an alternate language template in a pre-processor. Code in strings would be impossible (or nearly so) to translate at the template level and translating at the output level would have many issues including debugging and performance. There are tools available that translate normal source code, and you could do that, but I’m not sure why. It’s a lot of extra variables, when translating the template offers faster performance and more reproducible results. Sticking with template translation - the code block clearly indicates to the template preprocessor where to switch into translation mode.
The language output by the initial template must be VB, or “nearly VB.” Even if your primary interest is C#, a language neutral solution requires that the initial template have no ambiguity. Sticking with familiar and well supported languages is helpful because the initial output can be tested in VB, isolating problems in the template from any in the template translator/precompiler. This requires a non-ambiguous language I’ll call “Nearly VB”. If you’re strictly interested in C#, and have no interest in language neutrality, you can, of course, use VB’s XML literal code blocks to directly output C# code.
Ambiguity breaks the ability to build language neutral templates because the preprocessor has very little idea of the current context. It cannot understand whether a particular close curly bracket is an End If, a Next, an End Get or something else. Unfortunately, Visual Basic is not totally ambiguity free either, which forces the concept of “Nearly VB” rather than just normal VB. Nearly VB has one syntax change and a couple of extra rules when compared to VB.
VB is ambiguous on parentheses. It uses parentheses to include both method parameters and indices. VB is also ambiguous when it comes to case. To solve this in templates, use square brackets to indicate indices and parentheses for normal method calls. The C# compiler will help you find the problems when your C# output files fail to compile. The VB output can easily replace the square brackets with parentheses when outputting VB files.
At the moment I’m not convinced that the other meaning of square brackets – allowing identifiers to match keywords – needs to be supported very well. There aren’t that many keywords and simply avoiding them seems an easier solution. You can support them if you escape the character via the \x20 escape pattern and the ASCII character (/x28). OK, that’s not very pretty, so a shorter escape sequence may make sense if people run into this very often.
Case insensitive is really another way to say “case ambiguous”. Language neutral templates require that you correctly case all symbols; the preprocessor can manage the keywords it’s translating, but you’ve got to get the symbols correct. Consider a Symbols class with constants, which also provides Intellisense while you’re creating your templates.
VB is sloppy in not forcing you to include the open/close brackets after a call to a method that does not have parameters. In a broader perspective this is ambiguous because in C# the presence or absence of the parentheses indicates whether you want to call the method or grab the delegate. While that particular ambiguity is resolvable because VB would require the AddressOf operator (or a lambda expression), I’m not tracking symbols. So I don’t know whether your symbol is a method, variable, or property. Thus, I don’t know whether the parentheses are needed. For language neutral templates, you add the parentheses on all method calls.
NOTE: I actually explored whether this problem is solvable, and I believe it is not. I don’t think it’s that much to ask you to include the parentheses correctly – it’s just a place we VB coders have historically been lazy.
So, to allow language neutral templates:
- Use basic VB syntax
- Use square brackets instead of parentheses for indexes
- Maintain consistent case for all symbols
- Include open/close parentheses for all method calls
- Avoid keywords as symbol names or escape the surrounding brackets with the XML escape sequence
- Rather obviously, avoid features unique to VB
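The square-bracket rule is what makes the VB side cheap: when emitting VB, index brackets simply turn back into parentheses. A minimal Python sketch of that one step follows, assuming brackets inside double-quoted strings must be left alone; `indices_to_vb` is a hypothetical name and the real converter may work differently.

```python
def indices_to_vb(line: str) -> str:
    """Replace [ and ] with ( and ) everywhere except inside "..." strings."""
    out = []
    in_string = False
    for ch in line:
        if ch == '"':
            # VB doubles quotes to escape them, so toggling twice is harmless.
            in_string = not in_string
        if not in_string and ch == "[":
            out.append("(")
        elif not in_string and ch == "]":
            out.append(")")
        else:
            out.append(ch)
    return "".join(out)
```

For example, `indices_to_vb('x = items[0].Name')` gives `x = items(0).Name`, while the C# output keeps the brackets untouched.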
I’ll do another post in a few weeks on issues around spots where the two languages inherently work differently. There will be more items on this list, particularly around the management of nulls in relation to operators.
I do not dream that I’ve covered everything. The only way to ensure language neutral templates is to create them, ensure the code is syntactically correct, compiles, and runs in VB, and then create the similar code in C# and make sure you have valid syntax, a clean compile, and can run the finished applications. After the upcoming preprocessor has been out for a while we will have a better idea how you can break it and plug the holes where you can. But issues that involve ambiguity will have to be solved by the template author.
Mike asks:
Just curious if your metadata also contains validation rules or not? Things like property is required or range of valid values.
It could include them in three possible ways – it currently uses one and I’ve had two others working in the past that may be resurrected.
The metadata that the database inherently knows is automatically transferred - this would be nulls and string length. How well nulls are handled is up to the architecture, but the metadata definitely knows what's nullable.
I've experimented with two additional approaches: using extended properties and parsing the TSQL of check constraints. The first would work for simple ranges and other predictable data sets, but it puts information in an unexpected place. I currently can’t justify it over placing validation in known places in the handcrafted code.
Using check constraints leverages existing information so is a "good" thing. Unfortunately, no one ever seemed to care about the months of work I put into that five years ago so I let it stagnate. Since I know more now, I could resurrect that work, but honestly I don't think I'll get time soon.
The problem is that most people just don't put check constraints in the database very often. I find that unfortunate for many reasons, but it becomes a chicken and the egg problem. People don’t put the constraints in the database because they’ll have to restate them in the business layer for decent usability. This initiative doesn’t get attention to solve that problem, because the check constraints aren’t already there. Perhaps the time is ripe now. I would love to include check constraint based validation in the Open Source version that we plan to start up on CodePlex this week or next (public within thirty days after) – at least a framework for it.
Check constraints are closely related to defaults because both require parsing TSQL. Turns out, over the years folks have been primarily interested in defaults of “now”, new guids, and raw values. Today or Now are pretty easy because it’s just a straight up translation between a SQL function and a .NET function. Any straight up translations like that can be defined in sort of a metametadata (hate that phrase) layer. I handle all three of these scenarios in my metadata extraction tool (a metadata extraction tool will be part of the CodePlex project).
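The "straight up translation" layer can be as small as a lookup table. The Python sketch below assumes the default text arrives wrapped in parentheses, the way SQL Server typically reports it (for example `(getdate())` or `((0))`); the table contents and the name `translate_default` are illustrative assumptions, not the actual extraction tool.

```python
import re

# SQL function -> .NET expression; extend as needed.
DEFAULT_TRANSLATIONS = {
    "getdate()": "DateTime.Now",
    "getutcdate()": "DateTime.UtcNow",
    "newid()": "Guid.NewGuid()",
}


def _outer_parens_match(text: str) -> bool:
    """True if the first '(' is closed by the very last character."""
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i == len(text) - 1
    return False


def translate_default(sql_default: str):
    """Return a .NET expression for a known default, or None if unknown."""
    text = sql_default.strip()
    while text.startswith("(") and _outer_parens_match(text):
        text = text[1:-1].strip()          # peel one wrapping pair
    if text.lower() in DEFAULT_TRANSLATIONS:
        return DEFAULT_TRANSLATIONS[text.lower()]
    if re.fullmatch(r"-?\d+(\.\d+)?", text):
        return text                         # raw numeric value
    return None                             # leave for hand-written code
```

Returning `None` for anything unrecognized keeps the table honest: the generator only claims the defaults it can actually translate.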
I think validation should be stated in the business layer in rules. I wasn’t doing this five years ago so the whole process of incorporating validation from check constraints will be vastly simpler. Instead of code to code, you need to recognize a category – such as a bound range (the most important) and parse out the bounds into a structure usable by a specific rule. Then another rule is “there’s a check constraint and I think you need to validate based on it, but I can’t write it so you need to.” The architecture could enforce some code being written in response to that rule. To state the change from five years ago, the metadata wouldn’t contain code but the statement of which rule and its parameters.
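Recognizing the bound-range category might look something like the following Python sketch. It assumes a constraint definition of the shape `([Age]>=(0) AND [Age]<=(120))`; the pattern, the `RangeRule` structure and the function name are all hypothetical, and anything the pattern does not recognize falls through to the "you need to write it" rule.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class RangeRule:
    """Parameters for a bound-range validation rule (hypothetical shape)."""
    column: str
    minimum: float
    maximum: float


_NUMBER = r"-?\d+(?:\.\d+)?"
_RANGE = re.compile(
    r"^\(?\s*\[(?P<col>\w+)\]\s*>=\s*\(?(?P<lo>" + _NUMBER + r")\)?\s+AND\s+"
    r"\[(?P=col)\]\s*<=\s*\(?(?P<hi>" + _NUMBER + r")\)?\s*\)?$",
    re.IGNORECASE)


def parse_check_constraint(definition: str) -> Optional[RangeRule]:
    """Return a RangeRule for a simple bound range, else None."""
    match = _RANGE.match(definition.strip())
    if match is None:
        return None   # a constraint exists, but a person has to write the rule
    return RangeRule(match.group("col"),
                     float(match.group("lo")),
                     float(match.group("hi")))
```

The metadata then stores only the rule name and its parameters (column, minimum, maximum), not generated code.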
Validation in triggers would seem, at least to my weak TSQL mind, to be exceedingly difficult.
So, the basic answer to Mike’s question is “some, but not all of the really important scenarios are covered, and I don’t think you’ll ever cover all scenarios”
I better say it up front, because it will quickly become obvious. I am not a computer science graduate. I have never written a compiler. It was quite a route to get my thinking in line with this particular problem and I’m sure it will evolve further.
As I said in my post on Friday, one part of my solving the VB/C# problem without making unreadable templates is a preprocessor. I struggled with what to call it because the real pattern is – create the VB templates, run the processor to create the C# templates, execute the C# or VB templates. So is it really a pre processor? I am still calling it that as it is before running the C# templates so I’m thinking of it as an optional pre-processor.
The result is modified templates. A second set of source code and a second template assembly.
The first decision I faced was how much context I was going to demand for any decision. More context, more sophisticated decisions. You could attempt to build a full syntactic tool that understands the structure of your output code and knows a great deal about what you are accomplishing. This may or may not be possible, and will certainly require restrictions on what template code is legal because evaluating multiple paths will be a nightmare and stray strings can result in legal templates, but won’t provide the same evaluation. You may be able to; I’m choosing not to tackle that and decided on least possible context.
The absolute minimum of understanding about the template being converted is which of a finite set of states you are in. Possible states are:
- Template logic (the code that runs the template)
- Comments
- Code blocks
- Expressions
- Conditional blocks within code blocks
- Possibly additional states around declarations, for loops and using statements
My first attempt was line based. Faster, easier to recognize comments and nearly impossible to ever restructure the line wrapping correctly. Trust me, that route did not go well.
A week ago Friday, nearly in tears, I told my son Ben “Look, I told everybody I could do this, and Carl just posted that show. And I am doing it, except I think the bugs I am facing with end of line issues are not solvable.”
My brilliant son said “Why on earth are you doing it that way – do character by character.”
“What, rewrite the whole thing?” maybe I cried.
The rewrite actually went pretty well, painful as it was to abandon nearly completed code. It was made easier by the fact I really do not care about performance. This is a template translation. The converted templates will be compiled and blazingly fast. I can take a second or two a template to do the translation. Thus I can skip all that compiler theory that I never learned about managing buffers and look aheads and all that. A bit of brute force with the simplest possible RegEx.
I’m basically looking at the entire template as a string. I step character by character through the string doing a substring check starting at the current position. I avoid the dumbest of the .NET mistakes such as copying the substrings unnecessarily and I do concatenate via a string builder so performance doesn’t suck too badly. And I do restrict what I’m looking for to what makes sense in context. But I don’t worry that I am looking at the next handful of characters an excessive number of times.
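The brute-force scan described here can be sketched in a few lines. This is Python rather than .NET, so the details differ: `str.startswith` with a start offset plays the role of the in-place substring check, and a list joined at the end stands in for the string builder. `find_markers` is a hypothetical helper for illustration only.

```python
def find_markers(template: str, markers):
    """Walk the template one character at a time and report marker positions."""
    hits = []
    pos = 0
    while pos < len(template):
        for marker in markers:
            # Compares in place at pos; no substring is copied.
            if template.startswith(marker, pos):
                hits.append((pos, marker))
                pos += len(marker) - 1   # skip the rest of the marker
                break
        pos += 1
    return hits


MARKERS = ["<code>", "</code>", "<%=", "%>"]
sample = "a <code>x<%= y %></code>"
```

Each position is compared against every marker, which is wasteful in theory and irrelevant in practice for a one-off template translation.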
I start off in the template logic. I output the template logic character by character until I find a character sequence that indicates a new mode. I’m keeping this simple by managing both the modes and the required stack via the call stack. Meaning, when I shift into a new mode, such as the Comment mode I just call a method called TranslateComment. Comments are easy - just change the start character and read to the end of the line for output. I need comments treated differently because a code block in a comment should not be translated.
For now, I’m making the restriction that code blocks – blocks to output – must be exactly <code>stuff</code>. This makes parsing a bit easier than allowing any element name. If I’m in template logic and hit a code block, I know I need to start translating. I start looking for sequences that need conversion: Me as a word, If, For Each, End If, Next, etc. This list is pretty short right now; I expect the preprocessor to evolve.
If I’m in a code block and I hit an embedded expression (<%= ) I switch back to template logic mode. This is not precise but it’s close enough. Characters are output exactly until I hit another code block because this is template logic, not output code. If you concatenate strings in there, you’re toast, but you can call methods that are in the VB/C# namespaces.
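The mode switching via the call stack can be illustrated with two mutually recursive functions: one for template logic, one for code blocks. This Python sketch is deliberately tiny. It handles only whole-word keyword swaps, skips comment handling entirely, and would be confused by a string containing `%>` inside an expression. The keyword table and function names are assumptions for the illustration, not the actual preprocessor.

```python
import re

KEYWORDS = {"Me": "this", "Nothing": "null", "True": "true", "False": "false"}
WORD = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b")


def translate_text(text):
    """Swap whole-word VB keywords for their C# equivalents."""
    return WORD.sub(lambda m: KEYWORDS[m.group(1)], text)


def translate_logic(src, pos, out, end_marker=None):
    """Template logic mode: copy verbatim, descending into code blocks."""
    while pos < len(src):
        if end_marker and src.startswith(end_marker, pos):
            out.append(end_marker)
            return pos + len(end_marker)
        if src.startswith("<code>", pos):
            out.append("<code>")
            pos = translate_code(src, pos + len("<code>"), out)
        else:
            out.append(src[pos])
            pos += 1
    return pos


def translate_code(src, pos, out):
    """Code block mode: translate output text, descending into expressions."""
    pending = []   # literal output text not yet translated

    def flush():
        out.append(translate_text("".join(pending)))
        pending.clear()

    while pos < len(src):
        if src.startswith("</code>", pos):
            flush()
            out.append("</code>")
            return pos + len("</code>")
        if src.startswith("<%=", pos):
            flush()
            out.append("<%=")
            pos = translate_logic(src, pos + 3, out, end_marker="%>")
        else:
            pending.append(src[pos])
            pos += 1
    flush()
    return pos


def translate_template(src):
    out = []
    translate_logic(src, 0, out)
    return "".join(out)
```

Because each mode is a function call, returning from the call is what pops back to the enclosing mode; no explicit state stack is needed, and code blocks nested inside expressions come along for free.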
There are some special cases around code constructs. I recognize an If block by searching for the Then and taking what’s between as a code expression that needs translation. Wherever I’m translating expressions I just use a simple replacement because it’s really just separate symbols.
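The If handling, finding the Then and treating what lies between as an expression, could be sketched as follows. This is Python with a hypothetical replacement list; it only covers a handful of operators and is meant to show the shape of the approach, not the preprocessor's real rules.

```python
import re

# Everything between "If" and a trailing "Then" is the condition.
IF_LINE = re.compile(r"^(\s*)If\s+(.*?)\s+Then\s*$")

# Simple symbol-for-symbol swaps, applied in order.
REPLACEMENTS = [
    ("<>", "!="),
    (" AndAlso ", " && "),
    (" OrElse ", " || "),
    (" = ", " == "),
]


def translate_if(line):
    """Turn a VB 'If ... Then' line into a C# 'if (...)' plus open bracket."""
    match = IF_LINE.match(line)
    if match is None:
        return line                      # not an If line, leave it alone
    indent, condition = match.groups()
    for vb, cs in REPLACEMENTS:
        condition = condition.replace(vb, cs)
    return f"{indent}if ({condition})\n{indent}{{"
```

Inside a condition a lone `=` can only be a comparison, which is why the blunt `" = "` to `" == "` swap is safe there and would not be safe in an assignment statement.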
The preprocessor is simple and focused on what’s actually needed, not boiling the ocean. It will evolve as far as it needs to, staying well shy of both the power and usability issues of the CodeDOM – we just don’t need that for business templates in VB and C#.
Whew! I could write tons more on glitchy little details of this preprocessor that’s really eaten my last couple of weeks. It’s one of the pieces I want to get Open Source early on.
I’ve been catching up on blogs and ran across this from Zlatko from Dec. 14.
His basic point is that EF is more than an OR/M mapper because it works in a conceptual space between the object layer and the database – creating a third layer.
I’m very happy that Zlatko said this. It articulates something I’ve never articulated well. The metadata is not a representation of the object layer – it is a way of thinking described in metadata that can be thought of as entities, or abstractions, or something else rather vague and fluffy – see I have problems explaining it.
Entity Framework pins down this previously mind based abstraction. It’s a subtle shift but it exposes how we think about objects, and now gives us a word for it – the conceptual model. It takes what was previously a mind cloud that we shared by implication from metadata definitions and makes it something we can visualize in a drawing.
Unless I’m entirely missing the point though, I do not buy that the existence of this layer is new. I think most or all of us that do metadata based code generation have been doing this for years.
But it is not trivial and it is important to articulate and create a visualizer for something that we’ve just been doing between our ears by implication. It’s part of what makes the implications of EF for metadata for all code generation significant.
The EF conceptual and metadata layers are important even though its current incarnation comes up a bit short in richness and in ease of access. We can fix both these with some effort – I’m loving the moment in time we’re living in and just wishing I had twice as much time to work each day.
Do we all live in fear of that moment when we notice that we’re the one on the other side of Dilbert? When Dilbert is wise and, well, we’re not.
Two weeks ago I was writing a long paper explaining some nuances about the state of the templates at that time and asking my client not to reject it until he had looked into it and really understood it. So, in the next morning’s Dilbert strip someone comes to Dilbert and says “I’ll tell you my idea if you promise not to reject it until thinking about it” and Dilbert says “I already rejected it because only putrid ideas come with warnings”
So I spend the better part of the weekend rationalizing that my idea really doesn’t fall into that category.
And then I spring out of bed at 6AM Monday morning (I sort of wish that part was a joke) with the solution. So, let’s look at the problem today and the solution in the next post:
Yesterday’s code was:
Private Function MemberGetPrimaryKey() As String
Return OutputFunction( _
Symbols.Method.GetPrimaryKey, _
Scope.Protected, _
MemberModifiers.Overrides, _
ObjectData.PrimaryKey.NetType, _
Function() _
<code>
Return m<%= ObjectData.PrimaryKeys(0).Name %>
</code>.Value)
End Function
It’s easy enough to ditch the Return statement with a constant. I put these constants in a class and imported the class, which allowed me to directly access the constant, although it was in a different file:
Private Function MemberGetPrimaryKey() As String
Return OutputFunction( _
Symbols.Method.GetPrimaryKey, _
Scope.Protected, _
MemberModifiers.Overrides, _
ObjectData.PrimaryKey.NetType, _
Function() _
<code>
<%= returnString %> m<%= ObjectData.PrimaryKeys(0).Name %>
</code>.Value)
End Function
How bad is that?
But as Bill pointed out in comments on the last post, if all I’m doing is returning a value, I don’t need the code block at all and can teach the OutputFunction method to do the job. So, switching to a more complex and common example, and remembering that I’m out to solve the C#/VB single template problem to allow a single template for any architecture, I took these concepts a few steps further. The result of a more complex method becomes:
Private Function MemberPropertyAccessSet(ByVal propertyData As IPropertyData) As String
Return _
<code>
CanWriteProperty("<%= propertyData.Name %>", true)
<%= OutputConditional("m" & propertyData.Name & " <> value", _
Function() _
<code>
m<%= propertyData.Name %> = value
PropertyHasChanged("<%= propertyData.Name %>")
</code>.Value) %>
</code>.Value
End Function
Which for a single language is the same as merely doing:
Private Function MemberPropertyAccessSet(ByVal propertyData As IPropertyData) As String
Return _
<code>
CanWriteProperty("<%= propertyData.Name %>", true)
If m<%= propertyData.Name %> &lt;> value
m<%= propertyData.Name %> = value
PropertyHasChanged("<%= propertyData.Name %>")
End If
</code>.Value
End Function
Which would you rather debug? Imagine debugging the templates for an even more complex routine.
This spawned my Dilbert moment. If I have to convince you of the wisdom of this, then maybe it’s not so wise. So, what I jumped out of bed to do was build a preprocessor – converting templates that output VB into templates that output C#. By removing that technical restriction, we can find the sweet spot between reducing typos and obfuscating the code logic. I think that is about where the first and last code fragments in this post are. What do you think?
One of the issues with the code generation templates is that they do not test the syntax of the output as you type. I’m a VB coder, and that would be my fantasy, an editor that told me whether my templates produced valid output as I type. That’s nearly impossible to do, so don’t hold your breath.
In the meantime, you may have code like the following:
Private Function MemberGetPrimaryKey2() As String
Return _
<code>
Protected Overrides Function GetPrimaryKey() as <%= ObjectData.PrimaryKey.NetType %>
Return m<%= ObjectData.PrimaryKeys(0).Name %>
End Function
</code>.Value
End Function
Any typos between the <code> elements will result in dozens or hundreds of compiler errors when the output code is incorporated in your project. This is a pain in the neck to deal with, so anything we can do to have fewer typos is desirable.
When you create a UI for your users, you limit the number of mistakes the user can make via techniques like combo boxes. We can take advantage of Visual Studio’s editor to do a similar thing.
Your output code has logic within subroutines, functions and properties. While this code is trivial in the example above – just a return statement – your code will generally involve more complex logic. It’s important that you see this logic to evaluate it as you’re maintaining templates. The actual function declaration however, is not logic.
I created methods to output the enclosing declarations, as well as other non-logic based structures. This transforms the code above into:
Private Function MemberGetPrimaryKey() As String
Return OutputFunction( _
"GetPrimaryKey", _
Scope.Protected, _
MemberModifiers.Overrides, _
ObjectData.PrimaryKey.NetType, _
Function() _
<code>
Return m<%= ObjectData.PrimaryKeys(0).Name %>
</code>.Value)
End Function
I’ve spread this out for clarity.
You can still typo the name of the method and the word Return. While you can make a bad selection, you cannot make a typo in anything else. And the parameters of the output function remind you of the types of modifiers that make sense. Under the covers, the OutputFunction creates a FunctionInfo object. While I’m not using it here, the OutputFunction method accepts a ParamArray of ParameterInfo objects if your function needs parameters. Again, you can typo symbol names, but nothing else. Of course since the OutputFunction is within the code active in the IDE, you get full Intellisense, information blocks, background compilation and all the good stuff.
I’m using a lambda expression. In this case, it creates an in line delegate used to output the function body. If this template method becomes unduly complex, you could also use VB’s AddressOf operator to call a separate method as a delegate. In this case, the delegate signature I expect has no parameters and returns a string. Since the <code>…</code>.Value returns a string, it’s an effective delegate.
The FunctionInfo object includes an attribute collection. Thus, any attribute you desire to place on the function can be assigned by explicitly instantiating the function info object, rather than using the helper function.
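To make the shape of such a helper concrete, here is a rough Python analogue. It is not the real OutputFunction: the enum members, the `FunctionInfo` fields and the rendering rules are assumptions made for the sketch, and it emits VB only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Scope(Enum):
    PUBLIC = "Public"
    PROTECTED = "Protected"
    PRIVATE = "Private"


class MemberModifiers(Enum):
    NONE = ""
    OVERRIDES = "Overrides"
    OVERRIDABLE = "Overridable"


@dataclass
class FunctionInfo:
    """Everything about the declaration that is not logic."""
    name: str
    scope: Scope
    modifiers: MemberModifiers
    return_type: str
    attributes: List[str] = field(default_factory=list)

    def render(self, body: Callable[[], str]) -> str:
        # Attributes go on their own continued lines ahead of the declaration.
        lines = [f"<{attribute}> _" for attribute in self.attributes]
        words = [self.scope.value, self.modifiers.value, "Function",
                 f"{self.name}()", "As", self.return_type]
        lines.append(" ".join(word for word in words if word))
        lines.append(body())      # the delegate supplies the function body
        lines.append("End Function")
        return "\n".join(lines)


def output_function(name, scope, modifiers, return_type, body):
    """Convenience wrapper for the common case with no attributes."""
    return FunctionInfo(name, scope, modifiers, return_type).render(body)
```

The enums are what remove the typo opportunities: only the method name and the body remain free text, and attributes require building the `FunctionInfo` explicitly, mirroring the split described above.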
This is quite similar to the OutputClass and OutputRegion methods I’ve shown earlier, but it takes further the idea of using explicit method calls in the template to reduce the opportunity for typos in the output.
Output symbol typos are a problem, and you can avoid this through an enum or constants. You’ll use some of these constants across many templates and there will be a lot of them across your templates so I’d suggest you keep things clean by creating classes that contain your symbols. I created a namespace called “Symbols” and classes for Type, Method, Interface, etc. This gives nice clean Intellisense and makes it easier to find symbols in the constant list. Thus the code above becomes:
Private Function MemberGetPrimaryKey() As String
Return OutputFunction( _
Symbols.Method.GetPrimaryKey, _
Scope.Protected, _
MemberModifiers.Overrides, _
ObjectData.PrimaryKey.NetType, _
Function() _
<code>
Return m<%= ObjectData.PrimaryKeys(0).Name %>
</code>.Value)
End Function
That leaves “Return” as the only remaining opportunity for a typo – which is the subject of tomorrow’s post.
Chris asks:
At what point with code gen / templating do you start to think about doing all this codegen at runtime instead of compile time?
And if we were to be doing it at runtime would be be better served by using a dynamic language such as ruby to program in?
That's a good point. In a perfect world, there would be no need for code generation. We would write nothing but business/domain specific code and everything else would just happen. But for well over twenty years we've been aiming for that perfect world and we seem only a few baby steps closer than in 1987.
In an imperfect world, we have two basic choices, manage an architecture and run a lot of code at design time or manage an architecture and run a lot of code at runtime. Both require a fair amount of configuration.
And both can offer the real long term benefit of switching away from coding code – which is transitory and dying before you even finish coding. We want to switch away from coding code and toward creating metadata which is a true business representation. Of course metadata changes. But it changes at the speed the business changes – not due to artificial technology shifts.
For my effort – I want the extra code run at design time to offer the best possible runtime behavior, including performance. I can extend an architecture I’m expressing in templates far easier than I can extend an architecture I am expressing via an OR/M tool. I also want to debug directly into code specific to the problem at hand – I want to debug through generated code, not an OR/M engine.
The next round of effort at making plumbing simpler from Microsoft is also code generation – Entity Framework. When someone gets the plumbing correct and we truly never need to care, we can turn off the code gen and go direct to whatever structure, however it’s done and completely ignore the problems code gen is primarily used for today. In the long term, we shouldn’t care about anything except the business problem we’re solving.
Dynamic languages certainly change the architecture. I'm looking forward to exploring them as the overall knowledge base expands. But we've been sitting with a pretty dynamic language in our laps for many years, the majority of us have programmed in it, and with the exception of precisely one person - everyone I know varies between mild distaste and downright hatred for it. I think a lot of Javascript's problems have been related to debugging and platform issues, and I realize that there are differences between Ruby/Python and Javascript. However, if we didn't fall in love with the dynamic aspects of Javascript in the last ten years, I remain slightly skeptical about dynamic languages in the next few years. Some of the differences, and our attitudes and skills may change as a result, are that the new languages have a broader platform base and we're increasing our understanding of how to actually use them as opposed to hacking them enough to solve some trivial website detail.
But at the core, the technique for expressing metadata into a working application is not half as important as metadata at the core of the application, however it's expressed.
For someone that writes software for a living, I have a remarkably hard time using it. I would not have expected "Filter: Ignore" to display no new comments. Ignoring a filter would be more like showing everything.
Happily I have friends that are as patient as I am confused. Thanks to Bill McCarthy and Susan Bradley (who reset my password, which was lost in the bowels of my system, when I wanted to switch to Live Writer) my blog is slightly more functional.
My apologies to the folks that wrote comments that I seemingly ignored for the last several weeks. They should be fixed, and please let me know if you have difficulties.
I'm still approving anonymous comments so that will sometimes take a while. Non-anonymous comments go live immediately. Unless I start getting spammed too badly.