Saturday, November 27, 2010

#SharePoint2010 #HowTo use Managed Metadata to find related content

Background

The Managed Metadata Service is, in my opinion, one of the better improvements in SharePoint 2010. It can be used across sites, site collections and web applications to provide a centralized taxonomy. Term sets can be open, which allows end users to enter their own terms. The term set manager allows terms to be merged, reused, deprecated and moved. A term set can also be configured to allow tagging, so that end users can socially tag any page with a term.

Working for Puzzlepart, I’ve seen lots of potential uses for Managed Metadata, especially for pushing related content based on context. On the latest Pzl.Friday I finally got the chance to finish the research and build a simple prototype.

Business Problem

It’s a well-known fact that information workers spend a lot of time looking for information, by some estimates as much as 10% of their working time. Although search engines such as the built-in search in SharePoint 2010 can reduce the time spent looking, searching still requires an active effort from the user. It’s all about what you didn’t know that you didn’t know, and how to make you aware of that information in the context you’re in. This is a huge value proposition for information workers.

Although there has been some investment in social tagging in SharePoint 2010, the tag profile pages are limited to social tags and do not leverage the even stronger signal given when a term is used explicitly in a metadata context.

Personally, I would much rather have a solution where, based on the context (the page) I’m viewing, I also see other related pages or documents based on metadata relevance. I will show below that this is possible.

Example: looking into the Enterprise Wiki Page for SharePoint Development I also get information about SharePoint Projects, a document about licensing and other SharePoint related wiki pages.


Solution

I won’t go into the details of how taxonomy fields are actually handled in SharePoint 2010; for that I recommend reading Wictor Wilén’s “Dissecting the SharePoint 2010 Taxonomy fields”. I found very little information about how to find content related by taxonomy fields, but the forum post “How do I get SharePoint items tagged with a term from the term store” helped me a lot in understanding how it all fits together.

To achieve the goal the following tasks need to be done:
  1. Identify the taxonomy fields and retrieve the term ids
  2. Get the WSSIds for each term in each field
  3. Query for related information based on the WSSIds
  4. Optionally qualify the relevance further
  5. Render the result

In the solution below I’ll use Enterprise Wiki pages to find related content in the site collection based on the Wiki Categories, which is a default Managed Metadata column in the pages library.

Identify the taxonomy fields and retrieve term ids

The code below picks the standard taxonomy field based on a static display name. The field(s) could, however, be picked based on the context of the file.

    // Requires: using System; using System.Collections.Generic;
    //           using Microsoft.SharePoint; using Microsoft.SharePoint.Taxonomy;
    private static IEnumerable<Guid> GetTermIdsFromContext()
    {
        // Raw property value of the taxonomy field, looked up by its static display name
        var metaDataField =
            SPContext.Current.File.Properties["Wiki Page Categories"];
        var termIds = new List<Guid>();
        if (metaDataField != null)
        {
            // Split on every taxonomy delimiter; only the tokens that are GUIDs are term ids
            string[] termIdCandidates =
                metaDataField.ToString().Split(new char[]
                    {
                        TaxonomyField.TaxonomyGuidLabelDelimiter,
                        TaxonomyField.TaxonomyMultipleTermDelimiter,
                        TaxonomyField.TaxonomyTermPathDelimiter
                    });
            foreach (var idCandidate in termIdCandidates)
            {
                if (IsGuid(idCandidate))
                {
                    termIds.Add(new Guid(idCandidate));
                }
            }
        }
        return termIds;
    }

Supporting method

    // Requires: using System.Text.RegularExpressions;
    // Matches a GUID in 8-4-4-4-12 hex format, optionally wrapped in braces
    private static readonly Regex isGuid =
        new Regex(
            @"^(\{){0,1}[0-9a-fA-F]{8}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{12}(\}){0,1}$",
            RegexOptions.Compiled);

    private static bool IsGuid(string candidate)
    {
        return candidate != null && isGuid.IsMatch(candidate);
    }
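
To make the splitting step easier to follow, here is a minimal sketch that runs outside SharePoint. It is not part of the original solution. The delimiter characters '|', ';' and ':' are my assumption of what the TaxonomyField constants contain, and the class name TaxonomyValueParser and the sample value in the usage note are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TaxonomyValueParser
{
    // Assumption: stand-ins for TaxonomyField.TaxonomyGuidLabelDelimiter,
    // TaxonomyMultipleTermDelimiter and TaxonomyTermPathDelimiter.
    private static readonly char[] Delimiters = { '|', ';', ':' };

    // Same 8-4-4-4-12 pattern as the supporting method, with optional braces.
    private static readonly Regex GuidPattern = new Regex(
        @"^\{?[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\}?$",
        RegexOptions.Compiled);

    // Splits a raw field value on all delimiters and keeps only the GUID tokens.
    public static List<Guid> ExtractTermIds(string rawValue)
    {
        var termIds = new List<Guid>();
        if (string.IsNullOrEmpty(rawValue))
        {
            return termIds;
        }

        foreach (string token in rawValue.Split(Delimiters))
        {
            string candidate = token.Trim();
            if (GuidPattern.IsMatch(candidate))
            {
                termIds.Add(new Guid(candidate));
            }
        }
        return termIds;
    }
}
```

Calling TaxonomyValueParser.ExtractTermIds with a hypothetical value such as "SharePoint Development|0f8fad5b-d9cb-469f-a165-70867728950e;SharePoint Projects|7c9e6679-7425-40de-944b-e07fc1f90ae7" should give back the two GUIDs and drop the labels, since the labels never match the pattern.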

Getting WSSIds for each term in each field

WSS Ids are the actual lookup ids that can be used in an SPSiteDataQuery to get related content. The code below gets the WSS Ids using the TaxonomySession.

    // Requires: using System.Linq; using Microsoft.SharePoint.Taxonomy;
    private static IEnumerable<int> GetWssIdFromTermId(IEnumerable<Guid> termIds, bool includeDescendants, int itemLimit)
    {
        var wssIdList = new List<int>();
        // One session is enough for all lookups
        var session = new TaxonomySession(SPContext.Current.Site);
        foreach (var termId in termIds)
        {
            var term = session.GetTerm(termId);
            if (term == null)
            {
                continue; // Skip ids that cannot be resolved to a term
            }

            wssIdList.AddRange(
                TaxonomyField.GetWssIdsOfTerm(
                    SPContext.Current.Site,
                    term.TermSet.TermStore.Id,
                    term.TermSet.Id,
                    term.Id,
                    includeDescendants,
                    itemLimit));
        }
        return wssIdList.Distinct();
    }

Querying for related information based on WSSIds

The code below queries both custom lists and document libraries and builds up XML content. I chose to build it as an XML string so that it will be easier to customize the visual appearance using XSL later on.

    // Requires: using System; using System.Data; using Microsoft.SharePoint;
    // _xmlTextWriter and CurrentPage are class members not shown in this excerpt
    private void GetRelatedContents(IEnumerable<int> wssIds)
    {
        foreach (var wssId in wssIds) //Optimization possible
        {
            try
            {
                // BaseType 0: generic (custom) lists
                DataTable listResults = QuerySite(wssId, 0);
                if (listResults != null)
                {
                    foreach (DataRow row in listResults.Rows)
                    {
                        _xmlTextWriter.WriteStartElement("Content");
                        _xmlTextWriter.WriteElementString("Title", row["Title"].ToString());
                        string relativeUrl = row["FileRef"].ToString();
                        //Get the actual dispform url
                        relativeUrl = relativeUrl.Split('#')[1].Replace(
                            row["LinkFilename2"].ToString(),
                            string.Format("DispForm.aspx?ID={0}", row["ID"]));
                        string url = string.Format("{0}{1}", row["EncodedAbsUrl"], relativeUrl);
                        _xmlTextWriter.WriteElementString("Url", url);
                        _xmlTextWriter.WriteEndElement();
                    }
                }

                // BaseType 1: document libraries
                DataTable documentResults = QuerySite(wssId, 1);
                if (documentResults != null)
                {
                    foreach (DataRow row in documentResults.Rows)
                    {
                        string relativeUrl = row["FileRef"].ToString();
                        relativeUrl = relativeUrl.Split('#')[1]; //Get the actual url
                        string url = string.Format("{0}{1}", row["EncodedAbsUrl"], relativeUrl);

                        string title = row["Title"].ToString();
                        if (string.IsNullOrEmpty(title)) // Some content will not have a title
                        {
                            //Get title from filename without extension (keeps dots inside the name)
                            title = System.IO.Path.GetFileNameWithoutExtension(
                                row["LinkFilename2"].ToString());
                        }
                        //Ignore current page
                        if (!url.EndsWith(CurrentPage, StringComparison.InvariantCultureIgnoreCase))
                        {
                            _xmlTextWriter.WriteStartElement("Content");
                            _xmlTextWriter.WriteElementString("Title", title);
                            _xmlTextWriter.WriteElementString("Url", url);
                            _xmlTextWriter.WriteEndElement();
                        }
                    }
                }
            }
            catch (Exception)
            {
                // Forget about it for now: a failing query for one term is ignored
            }
        }
    }

    private static DataTable QuerySite(int wssId, int baseType)
    {
        var query = new SPSiteDataQuery
            {
                // TaxCatchAll is the hidden lookup column that collects the terms applied to an item
                Query = String.Format(
                    @"<Where><Eq><FieldRef Name='TaxCatchAll' LookupId='TRUE' /><Value Type='Lookup'>{0}</Value></Eq></Where>",
                    wssId),
                Lists = string.Format("<Lists BaseType='{0}' />", baseType),
                ViewFields =
                    "<FieldRef Name='ID' /><FieldRef Name='LinkFilename2' /><FieldRef Name='Title' Nullable='TRUE' Type='text' /><FieldRef Name='FileRef' /><FieldRef Name='EncodedAbsUrl' /><FieldRef Name='AverageRating' Nullable='TRUE'/>",
                Webs = "<Webs Scope='SiteCollection'/>"
            };
        return SPContext.Current.Site.RootWeb.GetSiteData(query);
    }
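
The "Optimization possible" remark in the loop above could be addressed by asking for all WSS Ids in one query instead of one query per id. The sketch below is not from the original solution and I have not tested it against a live farm. It only builds the CAML string, reusing the same Eq clause as QuerySite and nesting the clauses in Or elements, since an Or element takes exactly two operands. The class name TaxonomyCaml is made up for illustration.

```csharp
using System.Collections.Generic;

public static class TaxonomyCaml
{
    // Same comparison as in QuerySite, with a placeholder for the WSS Id.
    private const string EqTemplate =
        "<Eq><FieldRef Name='TaxCatchAll' LookupId='TRUE' /><Value Type='Lookup'>{0}</Value></Eq>";

    // Builds a single <Where> clause that matches any of the given WSS Ids.
    // Returns an empty string when there is nothing to query for.
    public static string BuildWhere(IEnumerable<int> wssIds)
    {
        string clause = null;
        foreach (int wssId in wssIds)
        {
            string eq = string.Format(EqTemplate, wssId);
            // Wrap what we have so far and the new clause in one <Or>
            clause = clause == null ? eq : "<Or>" + clause + eq + "</Or>";
        }
        return clause == null ? string.Empty : "<Where>" + clause + "</Where>";
    }
}
```

The result could be assigned to the Query property in place of the single Eq clause. An item tagged with several of the terms would then come back once rather than once per term. With a long list of ids the nesting gets deep, so I would cap the number of ids per query; where the practical limit lies is something I have not measured.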

Putting it all together

Once the basic methods are in place, it’s all a matter of calling them and transforming the XML to HTML using XSL. All of this is available in the source code, which you can download from Puzzlepart at the download link below.
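
As a rough sketch of the XSL step, the snippet below transforms XML of the shape written by GetRelatedContents into an HTML list. It is not the stylesheet from the download. The RelatedContents root element, the stylesheet itself and the class name RelatedContentRenderer are my assumptions; only the Content, Title and Url elements come from the code above.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class RelatedContentRenderer
{
    // Hypothetical stylesheet: renders each <Content> element as a linked list item.
    // The <RelatedContents> root element name is an assumption.
    private const string Stylesheet = @"<?xml version='1.0'?>
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='html' />
  <xsl:template match='/RelatedContents'>
    <ul>
      <xsl:for-each select='Content'>
        <li><a href='{Url}'><xsl:value-of select='Title' /></a></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to the given XML string and returns the HTML.
    public static string ToHtml(string xml)
    {
        var transform = new XslCompiledTransform();
        using (var xslReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xslReader);
        }

        using (var xmlReader = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            transform.Transform(xmlReader, null, output);
            return output.ToString();
        }
    }
}
```

Keeping the stylesheet separate from the query code is what makes the visual appearance easy to change later: only the XSL needs to be edited, and in a real Web Part it would more likely be loaded from a file or a property than from a constant.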

The end result is shown here as a Web Part embedded in the page, but it could easily be moved to a layout page or master page and be displayed on every site.
[Image: the related content Web Part embedded in a page]
Download the full SourceCode

Potential

While I’ve shown how to get related content based on the current context, some algorithm should be applied to tune relevance. This could be based on rating, modified date, or even on testing whether similar social tags exist on the same result.
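
One simple way to start tuning relevance, sketched here as an idea rather than as part of the prototype, is to count how many of the current page's terms each result shares and to use the rating as a tie-breaker. Counting shared terms is my own suggestion. The RelatedHit and RelevanceRanker types are made up for illustration, with one hit recorded per (term, item) match and the rating taken from the AverageRating field that the query already requests.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of one query match: the item and its rating.
public class RelatedHit
{
    public string Url;
    public double AverageRating;
}

public static class RelevanceRanker
{
    // Orders URLs by the number of hits (shared terms), then by highest rating.
    public static List<string> Rank(IEnumerable<RelatedHit> hits)
    {
        return hits
            .GroupBy(h => h.Url, StringComparer.OrdinalIgnoreCase)
            .OrderByDescending(g => g.Count())
            .ThenByDescending(g => g.Max(h => h.AverageRating))
            .Select(g => g.Key)
            .ToList();
    }
}
```

A page that shares two terms with the current page would then be listed before pages that share only one, regardless of rating. Modified date or social tags could be added as further ordering keys in the same way.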

One can also use social tagging as a source and mix the result based on both social tagging and Managed taxonomy fields.

Enjoy.

10 comments:

  1. Thanks for providing nice information about Sharepoint meta data with coding example. Its really helpful for sharepoint developers to use this code example for their application

  2. Thanks for the feedback. Sharing is caring :-)

  3. Your source code link is not working. I would really like to see the complete example.

  4. Mikael, the sourcecodelink is now functional again

  5. Great! Thanx a lot for the swift reply!

  6. Thanks for a great article! I implemented your code on my dev share point server and was able to deploy it successfully. However, no content is appearing on the related content web part. Does it take a while to show up? I have two wiki pages within 2 sites with identical terms for the web part to pick up. Thanks again for sharing the source code!

    Dinesh

  7. This is nice and informative post, Thanks to post.

    Sharepoint Development

  8. Very nice post, but the download link seems to be dead. Could you provide a new mirror?

  9. This post is realy good, but download link is not working, can you please post the updated link for the code (or) any one can share the share the files.. Thanks in Advance..:)
