Uncategorized | Malte Clasen

If you have ever programmed for SharePoint, you have probably had a look at the content database, just out of curiosity. And if you did, you probably closed it and never looked at it again, because you couldn’t stand the outstanding abuse of SQL as a data store.

Yes, the columns of the user data table really are named “nvarchar1”, “ntext1” and so on. If you ever wondered why SharePoint has some strange limits on the number of columns in a list, now you know why: there are only so many columns of each primitive type in the user data table.
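To make the limit concrete, here is a minimal sketch of how list fields could be assigned to numbered columns. It is not SharePoint's actual code: the `ColumnAllocator` class and the slot count of three are hypothetical, chosen only to show why a fixed set of columns caps the number of fields per type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical slot count; the real limits depend on the SharePoint version.
const int TextSlots = 3;

var allocator = new ColumnAllocator("nvarchar", TextSlots);
Console.WriteLine(allocator.Allocate("Title"));    // nvarchar1
Console.WriteLine(allocator.Allocate("Customer")); // nvarchar2
Console.WriteLine(allocator.Allocate("Status"));   // nvarchar3

try
{
    // A fourth text field has nowhere to go.
    allocator.Allocate("Comment");
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message);
}

// Maps logical field names to physical columns such as "nvarchar1".
public sealed class ColumnAllocator
{
    private readonly string _prefix;
    private readonly int _capacity;
    private readonly Dictionary<string, string> _columns = new();

    public ColumnAllocator(string prefix, int capacity)
    {
        _prefix = prefix;
        _capacity = capacity;
    }

    public string Allocate(string fieldName)
    {
        // A field that already has a column keeps it.
        if (_columns.TryGetValue(fieldName, out var existing))
            return existing;

        if (_columns.Count >= _capacity)
            throw new InvalidOperationException(
                $"No free {_prefix} column left for field '{fieldName}'.");

        var column = $"{_prefix}{_columns.Count + 1}";
        _columns[fieldName] = column;
        return column;
    }
}
```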

SharePoint tries to map non-relational data to SQL. On the one hand, it supports multiple Content Types per list, so each item can have a different set of attributes. On the other hand, it does not expose transactions, joins, or any other feature SQL developers use on a daily basis. It looks like they just had to find some kind of data storage which is accessible over the network and can be shared by multiple instances of SharePoint. In this regard, SQL Server seems like a reasonable choice, at least when your only alternative in the Microsoft product stack is SMB/CIFS.

Nowadays this looks to me like a great opportunity to use a document database such as MongoDB, RavenDB or CouchDB as the backend. If SharePoint list items corresponded to documents, there would be no need to map attributes to enumerated columns; they could be stored as is. SharePoint exposes nothing that could not be done on any of these NoSQL databases. I know that it is unreasonable to expect Microsoft to support a NoSQL backend in an upcoming version of SharePoint. But I would like to spread the word to show SharePoint developers that there is a sane world out there. And if they are ever in the position to design a system, they should evaluate all possible backends to find one that matches the concepts of the system, not just one that can be forced to do the job.
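As a rough illustration of what “stored as is” could mean, the sketch below represents two list items with different content types as plain dictionaries and serializes them to JSON, the shape a document database would typically persist. The field names and values are invented for the example, and `System.Text.Json` merely stands in for whatever serialization a real document store would use.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Two items of the same list with different content types.
// Each item carries only the attributes it actually has.
var task = new Dictionary<string, object>
{
    ["ContentType"] = "Task",
    ["Title"] = "Review draft",
    ["DueDate"] = "2012-03-01",
    ["PercentComplete"] = 50
};

var contact = new Dictionary<string, object>
{
    ["ContentType"] = "Contact",
    ["Title"] = "Jane Doe",
    ["Email"] = "jane@example.com"
};

// No mapping to nvarchar1, ntext1, ...; every attribute keeps its own name.
foreach (var item in new[] { task, contact })
{
    Console.WriteLine(JsonSerializer.Serialize(item));
}
```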

System.Reflection.AssemblyName is a class. From a pragmatic point of view, the discussion ends here. Classes in .NET are usually used to represent entities, not values. Yet in this case, the concept behind an assembly name strongly suggests that it should be a value type. The name itself carries no identity, it only denotes an entity. Classes and structs in C# are used with the same syntax, so if you don’t read the manual, it’s not obvious that AssemblyName is a class. This leads to interesting questions about how to handle equality. The developers of the .NET Framework chose not to override the Equals() method, so the default implementation, which compares object identity, is used. Therefore you can create two instances of AssemblyName with equal attributes for which Equals() returns false. While this is a technically valid choice, I consider it a violation of the principle of least astonishment. Maybe that is because it took me quite some time today to debug an issue caused by a broken usage of the Equals() method. And there are quite a few value types in the framework that work as expected, such as System.DateTime.
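The behaviour is easy to reproduce. The following is a minimal sketch; the assembly name `Example.Library` is made up, and comparing `FullName` strings is shown only as one possible workaround, not as an officially recommended one.

```csharp
using System;
using System.Reflection;

var a = new AssemblyName("Example.Library, Version=1.0.0.0");
var b = new AssemblyName("Example.Library, Version=1.0.0.0");

// AssemblyName does not override Equals(), so object identity is compared.
Console.WriteLine(a.Equals(b));              // False
Console.WriteLine(a.Equals(a));              // True

// Comparing the display names treats the two instances as values.
Console.WriteLine(a.FullName == b.FullName); // True

// DateTime is a struct and compares by value, as one would expect.
var d1 = new DateTime(2012, 1, 1);
var d2 = new DateTime(2012, 1, 1);
Console.WriteLine(d1.Equals(d2));            // True
```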

So how can you avoid making this mistake yourself? I consider choosing between class and struct to be one of the most important decisions when creating a new type. The Domain-Driven Design book by Eric Evans is a good starting point for this kind of decision.
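For comparison, here is a sketch of what a type modelled as a value could look like. `AssemblyId` is a hypothetical type, not part of the framework, and the choice of ordinal name comparison is an assumption made to keep the example short.

```csharp
using System;

var a = new AssemblyId("Example.Library", new Version(1, 0, 0, 0));
var b = new AssemblyId("Example.Library", new Version(1, 0, 0, 0));

Console.WriteLine(a.Equals(b)); // True
Console.WriteLine(a == b);      // True

// Hypothetical value type: two instances with equal attributes are equal.
public readonly struct AssemblyId : IEquatable<AssemblyId>
{
    public string Name { get; }
    public Version Version { get; }

    public AssemblyId(string name, Version version)
    {
        Name = name ?? throw new ArgumentNullException(nameof(name));
        Version = version ?? throw new ArgumentNullException(nameof(version));
    }

    // Equality is defined by the attributes, not by object identity.
    public bool Equals(AssemblyId other) =>
        string.Equals(Name, other.Name, StringComparison.Ordinal)
        && object.Equals(Version, other.Version);

    public override bool Equals(object obj) =>
        obj is AssemblyId other && Equals(other);

    // Must agree with Equals(): equal values produce equal hash codes.
    public override int GetHashCode() => HashCode.Combine(Name, Version);

    public static bool operator ==(AssemblyId left, AssemblyId right) =>
        left.Equals(right);

    public static bool operator !=(AssemblyId left, AssemblyId right) =>
        !left.Equals(right);
}
```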

Given that there is no rule for always getting it right, what can you do if your misrepresented value type is already in the field, used by dozens of developers? I don’t have a good answer for this one. Marking it obsolete and adding an improved type seems to be the only practical way, apart from ignoring the issue. But it has its drawbacks, such as the perfect name already being taken (what would you call an AssemblyName value type?) and many warnings starting to appear in perfectly working dependent code. I appreciate any comment on this.
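The obsolete-and-replace route might look roughly like the sketch below. Both `PackageName` and `PackageId` are hypothetical types invented for this example; the record struct syntax assumes C# 10 or later.

```csharp
using System;

// Existing callers keep compiling, but every use of the old type
// now produces compiler warning CS0618.
var old = new PackageName("Example.Library");
PackageId id = old.ToPackageId();

Console.WriteLine(id == new PackageId("Example.Library")); // True

// The misrepresented type stays in place for compatibility.
[Obsolete("Use PackageId instead; it compares by value.")]
public sealed class PackageName
{
    public string Name { get; }

    public PackageName(string name) => Name = name;

    // Migration helper so dependent code can switch gradually.
    public PackageId ToPackageId() => new PackageId(Name);
}

// The replacement: a record struct gets value equality for free.
public readonly record struct PackageId(string Name);
```

The warnings are exactly the drawback mentioned above: dependent code that works fine still gets flagged until it is migrated.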


SharePoint: Missing NoSQL Backend

AssemblyName: Value Type or Entity?