Overcoming issues with WSPBuilder's 'Copy to GAC' command.

While I usually can't stop praising WSPBuilder as the best choice for SharePoint 2007 development, it has a long-standing issue that can raise errors such as System.BadImageFormatException whenever you use the Visual Studio add-in's 'Copy to GAC' feature.

Why does this occur?


In most cases, it's because 'Copy to GAC' does not gracefully handle DLLs that are not .NET assemblies or that are not strong-named. If your project references DLLs in either category, or DLLs that depend on others in either category, 'Copy to GAC' will fail.


Example:


A common scenario is referencing Microsoft.SharePoint.Publishing.dll, which depends on ssocli.dll - a non-.NET assembly. WSPBuilder attempts to copy every DLL in the manifest to the GAC, so it fails on this one. Sometimes setting the 'Copy Local' flag to False in the reference properties resolves the error, but often it does not. I haven't figured out why it works only some of the time. It seems Visual Studio will continue to copy dependent DLLs to the /bin folder even when you mark Copy Local = False on the primary DLL. Please comment if you have an explanation for this.


Resolution:


The best solution I've found is to go straight to the WSPBuilder source code. That may sound like a daunting task, but fortunately, it's quite easy to do and only takes a few minutes. Here are the steps:
  1. Go download the latest Change Set from the WSPBuilder source code.
  2. Open the .csproj file under \WSPTools\App\VSAddIn
  3. Open Library > Menu > Commands > CopyToGAC.cs
  4. Find the Execute() method.
  5. About half-way down in the method, you'll see this line of code:
    foreach (string dllPath in dllFiles)
    {
  6. Just inside the opening brace, paste this block of code:
    try
    {
        // Skip assemblies that are not strong-named.
        AssemblyName asn = AssemblyName.GetAssemblyName(dllPath);
        byte[] token = asn.GetPublicKeyToken();
        if (token == null || token.Length == 0) continue;
    }
    catch (BadImageFormatException)
    {
        // Not a .NET assembly (for example ssocli.dll), so skip it.
        continue;
    }
  7. Add: using System.Reflection; to the using directives at the top of the file.
  8. Rebuild the project and copy the resulting WSPTools.VisualStudio.VSAddIn.dll into the GAC - overwriting the existing version.
  9. Restart Visual Studio and 'Copy to GAC' should now function without issues.



Hopefully, in the next release of WSPBuilder, this issue will be resolved and you won't have to repeat these steps.

What is a FAST Enterprise Search Project Part 2

Introduction

In my previous post (Why is FAST Enterprise Search Important) I discussed why an Enterprise Search project is important. In this post I will discuss what is needed for a successful Enterprise Search project. This should give you enough information to anticipate what an Enterprise Search project will require.

What is an Enterprise Search Project?

A few years ago I had to make the transition from custom application developer to application server consultant with Microsoft products. Project plans for implementing SharePoint, K2 or BizTalk were really not much different, other than having several new tasks associated with the configuration, integration, sustainability and maintenance of the new application server. Application server projects still have lots of custom artifacts and components that have to be developed. This is also the case with FAST.

When posed the question of what an Enterprise Search project is, I did not know at first where to start. I wanted to draw from my past experience. I knew that Enterprise Search projects can be complex, but I did not understand what a search project would entail.

Content Processing and Transformation

Enterprise Search within an organization has many complexities. First, we have to be able to index content wherever it may be (in a custom database, 3rd party enterprise application server, file share, mainframe, etc.). Custom code may have to be written to bring this content over to FAST so that it can be indexed. Knowing this, a comprehensive analysis must be completed to understand all the content/data that is spread across the organization. A common mistake is for a company to index bad data and end up with the old "garbage in; garbage out" issues. There must be plans for indexing both good and bad data, formatting unstructured data, making data relevant, normalizing data (removing duplicates), etc. We will need to understand the entire life-cycle of that data and how it can be effectively pulled or pushed into the FAST Search index. This is very similar to a data warehouse project, although the context is a little different.

An Enterprise Search project is also very similar to a complex ETL project because you will have to create several transformation processes/workflows. The processes must transform the content into a document that can be recognized by the FAST index. FAST refers to anything in the index as a document, even if the index item comes from a database. A document for FAST is a unique piece of data with metadata that gives it relevancy. FAST provides several out of the box connectors that do this transformation, and it provides an API for writing custom ones. In many cases you may have to build or extend connectors. Just as important as the ETL pre-processing, there are post-processing routines that must be executed before the search results are passed back to the user interface layer. More relevancy rules or aggregation of search results may be incorporated here. I was happy to hear that the FAST team also draws comparisons to an ETL project when discussing what an Enterprise Search project is.

User Interface

Most Enterprise Search platforms like FAST do not have a traditional GUI; FAST is an Enterprise Search engine that can be plugged into new or existing platforms. FAST does provide several controls that can be integrated into any UI platform, but in many cases you will be extending them or building completely new controls. FAST provides a rich API that is accessible from such languages as .NET, Java and C++.

User Profile

An important element of the FAST Enterprise Search project is to understand the profile of the user who is performing the search. Things such as their current location, where they are within the organization, what specialties they have, what past searches they have done, who they have worked for, and past or future projects, tasks or initiatives they have supported can all be used to give a more relevant search result. This requires integration with systems that can infer these relationships and pass this information along with the query to the FAST Query and Results server, which will return a relevant result.

Security

The profile is also important for incorporating security. FAST has numerous ways in which documents can be securely exposed to the end user. For instance, there is an Access Control List (ACL) which is part of the document instance in the search index. The ACL is populated during the indexing of content, and this may require customizations to set the ACL appropriately. More customizations may also be added to do real-time authorization, ensuring that documents returned from the index have not since been removed from the user's visibility. Another consideration is to partition indexes based on boundaries such as internet, extranet and intranet. There are several more considerations, so time must be set aside in the plan to ensure that content is secured properly.
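To make the two layers of security trimming concrete, here is a rough sketch of the idea in JavaScript. Everything in it is an assumption for illustration: the `acl` and `principals` properties, the function names and the `canStillRead` callback are hypothetical and do not come from the FAST API.

```javascript
// Hypothetical shapes: each indexed document carries an `acl` array of
// principals captured at index time, and `user.principals` lists the
// user's account and groups. None of these names come from the FAST API.

// Index-time layer: keep documents whose stored ACL names one of the user's principals.
function trimByIndexedAcl(results, user) {
  return results.filter(function (doc) {
    return doc.acl.some(function (principal) {
      return user.principals.indexOf(principal) !== -1;
    });
  });
}

// Query-time layer: ask the source system again, because the indexed ACL may be stale.
// `canStillRead` is an assumed callback that returns true or false.
function trimForUser(results, user, canStillRead) {
  return trimByIndexedAcl(results, user).filter(function (doc) {
    return canStillRead(user, doc);
  });
}
```

The first function is cheap because it only reads what is already in the index. The second adds a call back to the source system for each remaining hit, which is the kind of customization that needs time in the plan.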

Installation and Configuration

A major portion of the project plan needs to be devoted to the installation and configuration of the FAST server. There are several important things that need to be accounted for when doing this. For instance how many queries will be executed concurrently, what are peak usage scenarios, how much content will be indexed, what sort of complexities/exceptions are there in the indexing process, what is the anticipated growth, etc. All of this must be known for us to properly scale the FAST server and the design of custom components.

Testing

With all of the custom transformation and GUI components to support the Enterprise Search implementation, there will need to be a focus on system integration testing, system application testing, and user acceptance testing. There will be specific tests for search to ensure that indexing, query performance and result relevancy are accurate and within acceptable ranges. This is nothing new, but we need to be sure that a proportionate amount of time is incorporated into the plan so that a quality solution is put in place.

Sustainment and Governance

Sustainment also needs to be part of the plan, and it is commonly neglected. Too often the plan is focused on the short-term end result while the long-term management is not incorporated into the solution. What sort of organizational management changes are required to support and maintain the search implementation? What sort of configuration management business processes will need to be introduced to continually tune the index and relevancy model based on usage? What sort of new roles and responsibilities need to be incorporated into employee performance (from both a systems and business user perspective)? How is the enterprise taxonomy going to be maintained? What sort of key performance metrics and reporting are needed to consistently evaluate the success of the project? What is the process for incorporating change back into the solution (which is extremely important for Enterprise Search)? If questions like these are not incorporated into the early design of the project, there will be long-term challenges with the adoption and integration of the Enterprise Search investment.

Closing

As you can see the key to a successful Enterprise Search project is to understand the needs of the business and how the solution will be supported. Many of the tasks that were discussed are very standard; we just needed to put them in context.


Post originally written by RDAer Jason Apergis at MOSS & K2 Distillery

Why is FAST Enterprise Search Important Part 1

Introduction

The first thing that many will ask before beginning a major Enterprise Search initiative with a product like FAST is: why is Enterprise Search important? Secondly, what is an Enterprise Search project? My approach is to answer these questions not from a sales perspective but from a technology management and consultant perspective.

Why is Enterprise Search important?

Users have to work with massive amounts of data that is stored either internally or externally. Search can mean different things to different industries, but the goal is simple: display the right information to the right person at the right time without distraction. At the same time, we must have a flexible and configurable search platform that will surface the most relevant information to the business user from wherever it is stored.

Information Workers have to search and then utilize data. How do they do this? They typically have to log into an application and perform a search. Or when they enter an application, there may be some data contextually rolled up to them based upon who they are. There is a demand by business users to make search easier. We have heard many times "how can I search my enterprise data in the same way I Google something on the internet". Users want the ability to go to a single place, run a search query and receive results from across the entire enterprise. This is very different from performing a public internet search or using a search function contained within the scope of a single application. Public internet searching has its own complexities, but it is typically indexing content on websites. Enterprise Search becomes complex because the data being indexed can come in numerous formats (document file, database, mainframe, etc.). From the user perspective this complexity must be hidden. Users must be given a single result set that will allow them to research a problem, complete a task or even initiate a business process.

Organizations are challenged with providing comprehensive search solutions that can access content no matter where the data resides. Public search engines have also created demand for highly relevant search experiences. Relevancy is the key to success for a search solution. To have accurate relevancy it is important to know as much as we can about the user entering the query. Profile relevancy can be determined by a number of things: for example, where the person is located, what their job function is, and what past searches they or their colleagues have done. Relevancy can also be determined by the attributes associated with a piece of content: for example, whether the author is considered trusted, whether the content itself is fresh, or whether the content is highly recommended by other users. The search platform must have an adaptive relevancy model. It must be able to change based on business demands and subsequently learn how to provide better results using the factors that are incorporated into the relevancy model. An Enterprise Search platform like FAST can provide this advanced capability.

The vision of going to a single place to find data is not really a new concept. We have seen a major push for data warehouses to create a single location to facilitate enterprise reporting. We have seen enterprise portals created which give users a single user interface that provides contextual data from disparate systems. We have seen SOA trying to consolidate business services, and now we are seeing cloud services gaining traction in the market. The reality is that enterprise architecture will, by and large, be disparate. Companies have made significant investments in many technologies at one time or another, and consolidating them onto a single platform is not always realistic. This is why we are constantly trying to find new solutions to work with data in a uniform manner. This is an important justification for an Enterprise Search solution such as FAST.

To restate, the goal is to have an Enterprise Search platform that can create a single result set using disparate data from across the enterprise. Where a lot of organizations fall short is that they do not have the tools to navigate this data. Business users are required to have deep domain knowledge of the organization, the format of the data, and the business processes. The domain expert must know what is good or bad based upon experience, which is not transferable and makes continuity of operations challenging. This is yet another reason why an Enterprise Search platform provides significant value to an organization.

Here are some examples of how organizations have used Enterprise Search.

  • Several major ecommerce sites like Best Buy and Autotrader.com used FAST to better advertise to their customers, expose products to the customer significantly quicker, provide better navigation of search results and provide integration with OEM partners.
  • A business data brokerage firm was able to provide more relevant results, increase user satisfaction, provide data from multiple disparate locations, improve customer retention, create a collaborative data rating system and allow for communication between subject matter experts.
  • A community facilitator for the natural resource industry was able to create a B2B solution that provided dynamic drill/navigation of industry data, created automated extraction policies to mine for important data, regionalized their search results, created a pay model for more high-end results, and improved their sales model by using relevancy.
  • A major computer production company used FAST to improve economies of scale for support personnel. They significantly lowered call-center cost by directing users to search first, provided customers with more up to date support information and allowed their worldwide staff of engineers to use their native languages when performing a search.
  • A global law firm used FAST to create a knowledge management solution that allowed them to reduce research personnel and created a consolidated search experience. They significantly reduced ramp-up time of new lawyers, greatly improved relevant results with advanced content navigation, and provided better communication of best practices.
  • A law enforcement agency was able to allow investigators to electronically research massive amounts of data across the government which they normally did not have access to. This subsequently increased productivity, shortened lengths of investigations and helped them comply with government regulations.
  • Another government agency created a solution using FAST which would search the public domain for information on persons who are potentially breaking laws and initiate business processes to bring them to justice.

All these examples provide strong justifications for the value of an Enterprise Search solution. With FAST, these organizations reduced costs, met regulations, performed more efficiently, and generated more revenue for goods and services.

What is an Enterprise Search Project?

This will be discussed in my next blog What is a FAST Enterprise Search Project

Post originally written by RDAer Jason Apergis at MOSS & K2 Distillery

Effective SharePoint Governance

Recently, a colleague asked what I thought the keys were to a successful SharePoint installation. The first thing that came to mind is good governance.

Getting SharePoint itself installed and configured on some servers is not really the big challenge with SharePoint. Getting Governance right is.

In this post, I'm going to draw out a few critical pieces of SharePoint governance planning. By planning ahead for these things, we can roll out a more effective SharePoint environment for business users, as well as reduce potential support headaches down the road.


Educate your stakeholders

When asking business or IT stakeholders to make decisions regarding a SharePoint deployment, it's crucial to ensure that they understand what it is they are deciding. My governance documentation always includes educational sections on, for example, the difference between a site and site collection, or the options available for securing SharePoint. Don't expect good decisions from poorly informed stakeholders.

Document and Communicate Plan for proper governance

Obviously, your governance has to be documented. But how it's documented and communicated is critical to the acceptance of the new platform. I recommend breaking documentation up into the following documents:

Governance Manual - describes how governance is maintained, who can update documentation, who is on the governance board, what other documentation exists, etc...

Information Architecture - describes the "Types" of sites, how they are to be used, high level policies regarding the different types of sites, how content is to be maintained within the different types of sites, etc... (this can be broken into more detailed documentation depending on how "deep" the taxonomization goes.)

Operational Manuals - for each group that interacts with SharePoint, a manual should be created. These manuals detail the roles, responsibilities, policies and procedures. Typically, the following groups would be considered:

  • Farm Administrators
  • Site Administrators
  • End Users
  • Help Desk (or SharePoint Support Team, depending on how support is structured)
  • Server Operations (would include information about what to monitor and how to respond to various events, like the app server going down)

Support Manual - For every type of support request that can be anticipated, this document describes how the request is handled and passed between different supporting actors. This document ties together the Operational Manuals mentioned above.

Training and Communications plans - This is quite important. Without proper training, your end users will not get the most out of your SharePoint deployment. Site Administrators or power users should get some level of administrator training. This will allow them to make the most of collaboration features, like custom lists and views, that would otherwise go unused. Consider a "train the trainer" approach, as well as less intrusive cheat sheets and CBTs to get your users up to speed. As far as a communications plan, every organization is different, but it's always helpful to have buy in (or at least generate a little excitement), and keeping a clear line of communication is critical.

Rollout Plan - One of the most important areas to consider (though often given the least attention) is how you actually roll out new sites. There are a few things to consider here: Who gets access and when? What is the process for requesting a site? Are approvals required? Maybe the process is different depending on the type of site.

Regarding how sites are created, a word of advice: Don't just give people team sites and tell them their site is ready. When creating a new SharePoint site for a department or team, include basic user interviews. Questions like "Are there any Excel workbooks that your group uses to keep track of things?" will help you figure out what custom lists to create, for example. End users will need help if they are going to make the most of SharePoint's features.

Monitor adherence to governance and maintain a governance board

Don't forget to convene a governance board that will make final decisions regarding the SharePoint deployment. Often, items like quota exceptions or special purpose SharePoint sites are approved by the governance board, in addition to policy or procedure changes. The board should meet at least monthly, and should be provided with reports indicating whether governance is being adhered to. The governance board may choose to reprimand users who are violating governance, or simply have the operations team correct the problem. Without the governance board, your governance will have no teeth.

FAST Introduction

Great post from RDAer Jason Apergis regarding FAST search technology:

http://www.k2distillery.com/2009/10/fast-introduction-and-sharepoint-search.html

Deciding which authentication mechanism to use for your SharePoint Extranet

When companies implement an Extranet, one of the main design decisions is: How will users authenticate?

SharePoint supports custom authentication providers, so the options are basically limitless. However, almost all solutions will involve some sort of database for storing user profile information. The 3 most common are Active Directory, a custom SQL database, or ADAM.

ADAM is a lightweight LDAP provider - similar to Active Directory. It is the default provider used by the External Collaboration Toolkit for SharePoint (ECTS) - a solution accelerator from Microsoft that can assist in quickly setting up an Extranet. It helps configure a new Web Application in SharePoint that allows internal users to authenticate via AD and external users to authenticate via ADAM.

So, why might ADAM be preferred over a normal SQL database or just adding external users to AD? Below, I have listed some of the options to consider when deciding on an approach:

ADAM vs. SQL

  • ADAM is LDAP compliant and can more easily integrate with the profile service in SharePoint (to sync ADAM profile information into SharePoint) (Winner: ADAM)
  • ADAM has a better built-in UI for managing user accounts. (Winner: ADAM)
  • ADAM is more complicated to set up than regular SQL-based FBA (Winner: SQL)
  • SQL may be more appropriate for solutions you want to sell or deploy multiple times, because the SQL setup can be scripted whereas ADAM would have to be a manually installed prerequisite. (Winner: SQL - just barely)

ADAM vs. AD

  • Storing accounts in ADAM keeps your normal AD free of accounts for external users and generally cleaner.
  • ADAM also lowers security risk to the enterprise because you're separating external users into a different space. In other words, you don't have to worry about rules based on existing AD assignments inadvertently granting external users access to other resources, such as the ability to log in to a workstation or access HR applications.

In general, there are many benefits to using ADAM over AD or SQL. If it makes sense for your solution, definitely consider using the ECTS to assist in setting up your Extranet environment.

Setting a Content Type's ID Programmatically

I recently had to implement a system where a "generated" document library in one site collection was configured by reading a "template" document library in another site collection. The scheme worked as follows:

  • For each folder in the "template" library, create a folder in the destination library
  • For each file in the "template" library, create a content type that is only available within that file's respective folder, and have that content type point to the file in the "template" library, so the template files themselves are not copied around.

There were a few tricky pieces to implementing this. The first is being able to limit a content type to a specific folder in a document library. Although this can't be configured in the SharePoint UI, it is possible to do by setting the UniqueContentTypeOrder property of the SPFolder object:

http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.spfolder.uniquecontenttypeorder.aspx

This property expects a List of SPContentType, and the list must contain at least one item. Simply set this property to a List of the SPContentType objects that you want to make available in that folder. The content types must already be added to the folder's parent list.

The second tricky piece involved the way that docx files interact with document libraries and content types. It works like this:


  1. If you use the new item menu to create a new document from one of our generated content types, SharePoint supplies MS Word with the template file from the "template" doc lib (because we centralized these templates).
  2. The content type of that file (in the "template" doc lib) is just "Document", and that content type id is actually written into the contents of the docx file. (When you update a document's content type using the SharePoint UI, SharePoint will actually crack open the file and update the content type id inside.)
  3. When Word saves the file back to SharePoint, the file is saved with a "Document" content type id, NOT the content type we selected from the new item menu, because the template file is a "Document", not one of our generated content types.

So this is a problem, as we want these documents typed correctly.

As a workaround, we decided to create content types in the template library that match the names of the files, and then duplicate those content types into the other site collections. That way, the template file would actually have the proper ContentTypeID embedded into it. I knew we could create content types with prespecified IDs using Features, so I figured it would be a piece of cake to do it in code. I was wrong.

There is no way to programmatically set a Content Type's ID without using reflection. The ID property is read-only, and there is no way to specify the value when creating a new content type. Without being able to set the ID of our "generated" content types, this scheme wouldn't work. After poring through the SharePoint Object Model using .NET Reflector, I found the answer:

In order to set the content type's id, you must set it within the SPContentType instance after creation, but BEFORE you add the content type to a collection. The content type ID is a private member variable, but you can update it using reflection.

First, I defined a function that allows me to set the value of private member variables:


//requires: using System; using System.Reflection;
public static void SetPrivateMemberRefType<TObjectType, TValueType>(ref TObjectType o, string variableName, TValueType Value)
{
    Type t = o.GetType();
    //look up the private instance field by name
    FieldInfo fi = t.GetField(variableName, BindingFlags.NonPublic | BindingFlags.Instance);
    if (fi != null)
    {
        fi.SetValue(o, Value);
    }
}



Then I make use of that function to set a content type's id:



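//create the content type (it will have an autogenerated id after this line that we will replace)
//parent stands in for the content type to inherit from, for example web.ContentTypes["Document"]
SPContentType ct = new SPContentType(parent, web.ContentTypes, sName);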


//set the id using reflection
SetPrivateMemberRefType(ref ct, "m_id", new SPContentTypeId("MyPreSpecifiedContentTypeID"));


//add content type to web
web.ContentTypes.Add(ct);



The trick here is that the ContentTypeID is autogenerated by the object model before it even goes into the database, so changing the value internally before actually adding it to the DB just works. You do have to follow the standard rules for content type IDs, however, or your Content Type schema will get hosed:

http://msdn.microsoft.com/en-us/library/aa543822.aspx
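
As a quick illustration of those rules, here is a small JavaScript sketch of the two documented ways to derive a child ID from a parent ID: append two hexadecimal digits other than "00", or append "00" followed by a GUID written as 32 hexadecimal digits. The function names are hypothetical helpers of my own, not part of the SharePoint object model, and the GUID in the usage note is a made-up example.

```javascript
// Hypothetical helpers that only build ID strings; they do not call SharePoint.

// Convention 1: parent ID + two hex digits (anything except "00").
function childIdFromSuffix(parentId, twoHexDigits) {
  if (!/^[0-9A-Fa-f]{2}$/.test(twoHexDigits) || twoHexDigits === "00") {
    throw new Error("suffix must be two hex digits other than 00");
  }
  return parentId + twoHexDigits.toUpperCase();
}

// Convention 2: parent ID + "00" + a GUID with braces and dashes removed.
function childIdFromGuid(parentId, guid) {
  var hex = guid.replace(/[{}-]/g, "");
  if (!/^[0-9A-Fa-f]{32}$/.test(hex)) {
    throw new Error("guid must contain exactly 32 hex digits");
  }
  return parentId + "00" + hex.toUpperCase();
}
```

For example, with the built-in Document content type (0x0101) as the parent, childIdFromSuffix("0x0101", "0a") gives "0x01010A", and childIdFromGuid("0x0101", "8f3c2a10-5b7d-4e21-9c44-1a2b3c4d5e6f") gives "0x0101008F3C2A105B7D4E219C441A2B3C4D5E6F". Any ID you push in through reflection should follow one of these two shapes.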

The final code looks something like this:


SPFile file = templateFile; //templateFile stands in for my source "template" file

//create a new web-scoped content type and copy the content type id of the source
SPContentType ct = new SPContentType(destinationWeb.ContentTypes["Document"], destinationWeb.ContentTypes, file.Title);
SetPrivateMemberRefType(ref ct, "m_id", sourceWeb.ContentTypes[file.Title].Id);
destinationWeb.ContentTypes.Add(ct);

//set the doc template to the original file
ct.DocumentTemplate = sourceSite.MakeFullUrl(file.Url);
ct.Update();

//create the list scoped content type, again, copying the content type id of the source
ct = new SPContentType(ct, destination.ContentTypes, file.Title);
SetPrivateMemberRefType(ref ct, "m_id", sourceWeb.Lists["Templates"].ContentTypes[file.Title].Id);
destination.ContentTypes.Add(ct);

Profile Import and Audience Compilations Not Running

I recently ran into a strange issue with a very basic SharePoint Farm consisting of a single Web Front End and a single Application server. Everything seemed to be running fine and then one day User Profile imports and Audience compilations simply stopped running. They could both be run manually from the SSP administration screens just fine but they never ran at their scheduled times. It was as if the scheduled timer jobs did not even exist. After examining the hidden timer jobs using this technique, I determined that the jobs were fine. Combing through the logs revealed only one recurring error:

Configuring the Search Application web service Url to 'https://appserver:56725/Search/SearchAdmin.asmx'.    
Exception caught in Search Admin web-service proxy (client). System.Net.WebException: The underlying connection was closed…

The strange thing was I had never configured SharePoint to use SSL for the web services. It turns out that internally, SharePoint uses SSL to hit these web services, and the SSL binding along with the accompanying certificate is configured when the product is installed. Attempts to browse to that URL got no response. Finally, after more searching, I turned up this KB Article. In the end, the blame falls on .NET 3.5 Service Pack 1, which can cause the self-issued certificate to become corrupted. The fix described in the KB article does work; the downside is that it requires the IIS 6 Resource Kit. In any case, if you run into a similar issue, now you know the fix.

Optimal settings for SharePoint Diagnostic Logging

Most environments I've seen either don't turn on diagnostics logging at all or they turn everything to Information/Verbose - which can create huge logs and volumes of information to sift through.

I took some time to figure out what I think are optimal settings for diagnostic logging - so that you get only the important messages. Here are my settings:

In Central Administration -> Operations -> Diagnostic Logging:








Category                    Event    Trace
All                         Error    High
MS Search Administration    Error    Monitorable
Setup & Upgrade             Error    Monitorable


That seems to always give me just what I'm looking for.

Resolving issue with multiple Contact Detail web parts on the same page

Whenever you attempt to add more than one Contact Details Web Part to the same page in SharePoint, only the first one will properly implement the drop-down box and the contact status on the "skittle" / "jellybean".

The problem is that the Contact Details Web Part writes out the image of the jellybean using a hard-coded "id" value. Therefore, when you add more than one, you have multiple objects with the same "id" in the HTML and the JavaScript gets confused.

To work around this, drop a Content Editor Web Part on the same page and paste in the following code:

<script type="text/javascript" language="javascript" for="window" event="onload">

var presence = document.getElementsByName("imnmark");
for(var i = 1; i < presence.length; i++) {
    var element = presence[i];
    element.id = "contact_im" + i + ",type=sip";
    element.onload();
}

</script>


That should fix everything up.
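
If you want to see the renaming logic on its own, here is a minimal sketch of the same loop pulled into a function. The function name is my own, and the guard around onload is an addition that is not in the script above; it simply skips elements that have no handler attached.

```javascript
// Give every element after the first a unique id, then re-run its onload
// handler (when it has one) so the presence code picks up the new id.
function makePresenceIdsUnique(elements) {
  for (var i = 1; i < elements.length; i++) {
    var element = elements[i];
    element.id = "contact_im" + i + ",type=sip";
    if (typeof element.onload === "function") {
      element.onload();
    }
  }
  return elements;
}
```

On the page you would pass it the same collection the script uses, document.getElementsByName("imnmark"). The first element is left alone because it already works with the original id.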