Thursday, January 26, 2012

FIM SQL Server MA (or PowerShell for SQL Table-Valued Parameters)

 

Working on a Proof of Concept for FIM has been a refreshing visit to some of my old haunting grounds.  Long before AD existed (or even NT) my first real job was for a database consulting company.  It no longer exists, but its legacy can be found at www.noetix.com; and some of the people I knew back then are still there.

The FIM PoC begins with a text-based authoritative data source with no natural key.  The FIM text-based MAs require a key, so I could not use those MAs and still support user account renames.  So I decided to import the data into a SQL Server table with an Identity column and use the SQL Server MA.  This decision has brought me a fair amount of SQL work that I have not done for years.

The first fun task was generating the delta view.  I opted to follow the trigger approach, but wanted it to act more like Active Directory DirSync where one can retrieve the deltas based on a cookie being held and presented, but the older deltas still exist.  So I created a timestamp-based view of the delta table at MA runtime specific to my particular FIM instance.  The view will present all the deltas from the last provided timestamp plus a 5 minute overlap to handle the Kerberos-allowed time skew.  This approach also allows a parallel FIM instance to receive its own deltas without impact to each other by simply creating a differently-named view specific to that instance.

Then I started working on multi-valued attributes.  Per standard normalization rules for SQL, one ends up with a second table linking the primary object table Identity column to the multivalued attribute name and a single value.  To add a second value to the attribute the table needs another row with the same Identity foreign key, same attribute name and the second single value.

FIM’s granularity for SQL deltas is limited to indicating the object as a whole has changed.  It has no granularity for indicating attribute deltas like DirSync can, so any insert or delete into the multi-valued table triggers a delta of the whole object.  This matches up well with the text-based source as all attributes (single and multi) are encoded into one row in the file.  There is no easy way to know what has changed in the source data when it is received, so I simply need to make sure the SQL data exactly matches the newly received row from the authoritative data source.  Put into practice this means all of the single-valued attributes must be rewritten, all the existing multi-valued attributes must be deleted and the new multi-valued attributes must be inserted.

I was brought up with the stored-procedure methodology for interacting with any database, so my goal was to develop a stored procedure that I could call from PowerShell to take care of the attribute updates.  I split this into two stored procedures: one for the single-valued attributes, and one for the multi-valued attributes that I could iterate over for each multi-valued attribute in the source feed.  The multi-valued stored procedure causes difficulty in the handling of the one delete and “n” number of inserts.  You can’t put that into one stored procedure and support an unknown number of multi-value attribute inserts.  So either it has to be broken into two stored procedures (execute the delete stored procedure followed by n executions of the insert stored procedure), or find a way to get the single stored procedure to recognize an array for the inserts.

The multiple stored procedures didn’t sound elegant, and I haven’t worked on SQL Server for a while, so I spent a little time searching and fairly quickly found Table-Valued Parameters and a great blog entry from Erland Sommarskog, SQL Server MVP.  One of the nice benefits of FIM not being cross-platform is its requirement of SQL 2008 and the ability for me to now use other new SQL 2008 features.

Begin by creating a User Defined Table Type:

CREATE TYPE [dbo].[multivalue_list] AS TABLE(
    [attributevalue] [nvarchar](1024) NOT NULL
)

Create a stored procedure that uses that type[1]:

CREATE PROCEDURE
    [dbo].[import_people_multivaluesdata]
          @p_cn varchar(64)
        , @p_attributename varchar(64)
        , @p_multivalue_list multivalue_list READONLY
   
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Insert statements for procedure here
    DELETE FROM
        people_multivalues
    WHERE
            AttributeName = @p_attributename
        and ObjectID IN (
            SELECT
                ObjectID
            FROM
                objects
            WHERE
                cn = @p_cn
            )
   
    INSERT INTO
        people_multivalues
    SELECT
          o.ObjectID
        , @p_attributename
        , mvl.attributevalue
    FROM
          objects o
        , @p_multivalue_list mvl
    WHERE
        o.cn = @p_cn
       
    SELECT
        COUNT(m.objectid) as row_count
    FROM
        objects o
            JOIN
        people_multivalues m
            ON
                o.ObjectID = m.ObjectID
    WHERE
            o.cn = @p_cn
        and m.AttributeName = @p_attributename
END

There are two important things to note in this procedure.  The first is the use of the READONLY directive on the input table parameter.  The second is the FROM clause for the INSERT statement having no JOIN clause.  The incoming table only has the values for multi-valued attribute, so there is nothing on which to join.  This results in the Cartesian product between the two, resulting in one row being inserted for each row in the input table parameter – exactly the goal!

Calling this procedure from T-SQL is fairly straight forward:

declare @mylist multivalue_list
insert @mylist(attributevalue) values ('val1'), ('val2')
exec import_people_multivaluesdata @p_cn = 'cn1', @p_attributename='multiattr', @p_multivalue_list = @mylist

Calling this from PowerShell requires a bit more setup.  Mr. Sommarskog’s sample for ADO.Net (really C# SQLClient) can be translated into PowerShell as follows (switching from int to VarChar, and using our above stored procedure):

$products = "A", "B", "C"

$product_list = New-Object 'System.Collections.Generic.List[Microsoft.SqlServer.Server.SqlDataRecord]'

$tvp_definition = New-Object Microsoft.SqlServer.Server.SqlMetaData ("attributevalue", "VarChar", 1024)

ForEach ($product in $products) {
    $rec = new-object Microsoft.SqlServer.Server.SqlDataRecord($tvp_definition)
    $rec.SetSQLString(0, $product)
    $product_list.Add($rec)
}

$cmd.CommandType = CommandType.StoredProcedure
$cmd.CommandText = "dbo.import_people_multivaluesdata"

$cmd.Parameters.AddWithValue("@p_cn", "cn1")  | Out-Null
$cmd.Parameters.AddWithValue("@p_attributename", "multiattr")  | Out-Null
$cmd.Parameters.Add("@p_multivalue_list", [System.Data.SqlDbType]::structured) | Out-Null
$cmd.Parameters["@p_multivalue_list"].TypeName = "multivalue_list"
$cmd.Parameters["@p_multivalue_list"].Value = $product_list

$ReturnRowCount = $cmd.ExecuteScalar()

And there you have it.  With one call to a stored procedure, all current multi-value attribute entries for a specific attribute will be deleted and completely replaced with the new list of values and is scalable to any number of entries in the list.

[1] You can tell the difference between my own SQL code for the procedure and the GUI-generated code for the UDT.  This is how I was taught to write SQL code.  I see very few other samples that follow the same layout.  However, this is the only layout that gives you complete control of the ordering of table names or column names where you don’t have to worry about forgetting to have correct punctuation.  All the column names in the select statement are nicely lined up, and you can easily verify all subsequent columns begin with a comma.

No comments:

Post a Comment