Flow on MS SQL - Extended

This is the second part of the training "Flow on MS SQL Server". In this training we will cover hierarchical data, stored procedures, Data Tree portlet and Row Level Security

We haven't yet recorded a video for this second part of the SQL training since we are trying out some new grounds around our training model. If you liked having a video in the Basics part of the training, please send an email to university@novacura.com describing what you liked or not about having a video together with the online training material.

Video: see above Estimated length of training: 4 hours

Prerequisites:

Flow and Flow portal 2 with AdministerWorkflow permission

The logins, users, tables and views that we created in the previous part (Basics), if you do not have them you can start by adding them by first in the master database execute the code under the "Master" tab here below, and then move over to your training db and execute the code under "Your training DB"

ed if you haven't already done the previous excercises
--Use Master
CREATE LOGIN Mars WITH password='2ABYlabiRdWckJsSnOxd';
CREATE LOGIN Moon WITH password='dNADM2vDJ1MdnzLlLBCp';

SQL version 2008 or later for HierarchyId, and the database compatibility level set accordingly. You can check your SQL server version with

select @@VERSION

And to find out some more information about license etc this query can be useful

SELECT SERVERPROPERTY('productversion'),
 SERVERPROPERTY ('productlevel'),
 SERVERPROPERTY ('edition')

Some features have changed in the versions after 2008 as well, so, if possible, try to be on one of the later SQL Server versions.

Parent/Child and Hierarchies

We are going to continue with our task planning tool assuming that Nasa does not only embark on simple tasks, but also some more complicated endeavors that would have to be broken down in into smaller tasks, that in turn also might need to be broken down, etcetera. Let say you found this sketched on a piece of paper dwindling down from the sky after a test rocket blew up during re-entry above your house:

  • Go to Space

    • Build rocket

      • Develop engine

        • Procure engine material

      • Test engine

      • Build fuselage

        • Procure fuselage material

    • Go to Mars

      • Hire Mars astronauts

    • Go to the Moon

      • Hire Moon astronauts

And you decide to help Nasa go digital.

There are many examples of where a hierarchy like this is useful, perhaps the most obvious around ERPs is the Bill of Materials (BOM), but also other things like organization charts and aggregating up results from Key Performance Indicators (KPI), in the organization tree.

SQL Server provides us with a datatype especially for this called HierarachyId. We will now take a closer look at how to make use of this.

Remember how we added two columns to our Task table called ParentTaskId and Hid

It might not be expected for systems outside of SQL server to fully understand the datatype HierarchyId (e.g. our own portal), so why not just keep a parent / child structure alongside since we control the data.

We will use "stored procedures" to manipulate our data, which is also a good segue over to how they work in SQL.

SQL Insert into

We do need a root node to work with, so create our top level task with this code:

Insert Task (HId, TaskName)
select 0x, 'Go To Space'

where 0x is the hierarchy for the root, and ParentTaskId shall be NULL.

Stored Procedure - NewTask

Stored Procedures in SQL Server are snippets of code that can be executed on demand, and they might or might not contain input variables (and/or output). Stored procedures are also referred to as sp, proc, or sproc.

To just illustrate the simplest form of a procedure:

Create procedure ProcDemo 
as
SELECT *
  FROM [Task]

Then execute the procedure:

exec ProcDemo

As you see, the procedure executes the select code from above. This can sometimes be used to trick Flow into running code you know is ok, but our parser find appalling. Now drop our dummy procedure:

 drop procedure ProcDemo

Procedures can either be code that SQL just executes (like the example above) or it can have input and/or output parameters that we will look at next, and we will also implement some more logic into the procedure itself. You do not need to fully understand the procedure here below, the main objective is to understand the usage of stored procedures and how to use Flow to interact with them.

Input variables are values being passed into the procedure. In our case, it will be where in the tree the tasks should end up (@Parent) and the name of the task (@TaskName).

Output variables are values that can come out from our proc, e.g. the result from the execution.

Internal variables are values that are being fetched or modified within the proc. We will prefix internal variables with @c_ to indicate to ourselves that it is an internal variable used inside the procedure.

Executing Stored Procedures with input variables can be executed by passing the values in the exact right order as they are specified in the proc (exec NewTask2 13 , 'Test') or by assigning each variable a value (exec NewTask2 @Parent = 13 , @Task2Name = 'Test'). We will use the parameter’s name as a best practice, both for readability in Flow later, but also to give you the option to change the order of the variables. The downside is that we will need to declare the parameters in Flow and they might collide with existing variables in Flow (more on that later).

Create a stored procedure that we will use to create our tasks in order for them to end up correctly in our hierarchy tree using this code:

Create PROC [dbo].[NewTask](@Parent int, @TaskName varchar(200), @NewTaskId int = NULL Out)
AS
/*
2021-04-29: Created to create tasks

To test: 
NewTask @Parent =, @TaskName =
*/
BEGIN
 DECLARE @c_ParentsHId HierarchyId, @c_MaxHid HierarchyId
 SELECT @c_ParentsHId = HId
 FROM Task
 WHERE TaskId = @Parent
 SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
 BEGIN TRANSACTION
 SELECT @c_MaxHid = max(HId)
 FROM Task
 WHERE Hid.GetAncestor(1) =@c_ParentsHId ;
 INSERT Task (HId, TaskName, ParentTaskId)
 VALUES(@c_ParentsHId.GetDescendant(@c_MaxHid, NULL), @TaskName, @Parent)
 COMMIT
END ;
GO

ToString() and IsDescendantOf()

Since the HierarachyId datatype in SQL is not so readable for a human there is a trick to make it easier to interpret that is called ToString(). And we have the IsDescendantOf() that checks the descendants of the node.

Let us alter vTask to include some more columns and make use of ToString() and IsDescendantOf()

Alter view vTask --or create if you do not alreay have it
as
/*
2021-04-29 olcase: created
select * from vtask
Select * from task
*/
Select TaskId, TaskName, ParentTaskId, Hid, Hid.ToString() HIdString
,(select count(*) from Task t2 where t2.HId.IsDescendantOf(t.HId) = 1) GroupSize
,Case (select count(*) from Task t2 where HId.IsDescendantOf(t.HId) = 1) when 1 then 'false' else 'true' end as HasChildren
from Task t
where Hid is not null

Now a select from vTask should look like this:

Note the TaskId from your result

Create the first sub-task by executing our stored procedure created above, assigning our parent (the TaskId of our root node, in my case it is 1, but if you have added other rows it might be something else in your database, so change the @Parent variable to whatever the Id is of your task with the name ‘Go To Space’) and the name of the task.

NewTask @Parent = 1, @TaskName = 'Build rocket'

Create the structure that we described above using your NewTask-Procedure. You might need to check vTask between them so that you put it under the right parent TaskId, no worries if they are not the exact same for you as in this example, as long as the parent/child structure is intact.

select * from vtask
exec NewTask @Parent = 1, @TaskName = 'Go to Mars'
--check vTask to get correct Parent
exec NewTask @Parent = 3, @TaskName = 'Hire Mars astronauts'
exec NewTask @Parent = 1, @TaskName = 'Go to the Moon'
--check vTask to get correct Parent
exec NewTask @Parent = 5, @TaskName = 'Hire Moon astronauts'
--check vTask to get correct Parent
exec NewTask @Parent = 2, @TaskName = 'Develop engine'
--check vTask to get correct Parent 
exec NewTask @Parent = 7, @TaskName = 'Procure engine material'
exec NewTask @Parent = 2, @TaskName = 'Test engine'
exec NewTask @Parent = 2, @TaskName = 'Build fuselage'
--check vTask to get correct Parent 
exec NewTask @Parent = 10, @TaskName = 'Procure fuselage material'

This should give us the fields we need, the size of the group, and if it has children. Except for that, the TaskIds' and ParentTaskIds' can differ your database should look something like:

Current result from vTask (TaskId and Parent might differ)

Note here how easy it was to see the number of children for the group, while if we would only have used parent/child structure we would need to build a CTE which is both slower and more complex.

If we allow ourselves to think ahead, we can see how easily we could see how much time or cost is planned and/or used, not only for each task but also aggregated upwards in the hierarchy.

​Portlet: Data Tree

To Visualize our tree structure we can use the data tree in the portal

Task Tree portlet

The listener for the tree portlet is called “On Tree Node Selection”. Create a table portlet that listens to our tree portlet and selects the node we click on and displays it.

Graphical user interface, application Description automatically generated

Hierarchy.IsDescendantOf()

An extremely handy function of HierarchyId is to be able to easily query all the descendants (and itself) from a simple query. To practice that, let us create another table portlet that listens to the tree portlet.

Select
t1.*
from vTask t1
left join task t2 on t2.taskid = {TaskId}
where t1.Hid.IsDescendantOf(t2.HId) = 1

Table portlet listening to the tree portlet

The reason for the self-left join is that we are sending TaskId as a variable, not as HId, so it is similar to do a subquery to ask for the HId of that TaskId, just a bit fancier.

Now clicking on a node in the tree portlet should show itself and all its children in the table portlet we just created.

Deleting rows and Raiserror

Now let us create a CRUD flow for the table portlet that allows you to update the TaskName of a task, and also allows for the deletion of tasks if they do not have children. We will also look at the SQL Raiserror functionality.

Hopefully, you know how CRUD flows work together with the portal, call the machine flow “CRUD Task” and it should look like this:

Code:

if @Type = 'Add'
Begin
 select 1
End
if @Type = 'Update'
begin
Update Task set TaskName = @TaskName
where TaskId = @TaskId
end
if @Type = 'Delete'
begin
 If @GroupSize > 1
 Begin
 RAISERROR ('The task has subtasks, please delete or move its children’s tasks before deleting this', 16, /*Severity */ 2 /*State*/);
 End else
 Begin
 Delete from Task where TaskId = @TaskId
 End
end

and of course, connect it to the table portlet like this:

Now if you try to remove rows (Delete and Update-buttons (to save changes)) from the table portlet for a task that has children (GroupSize > 1) it should throw this error at you:

This prohibits the user from accidentally deleting an entire tree, but rather the user would have to delete them in order from the lowest level in the hierarchy on up. Just as a precaution (and to demo "Raiserror" for this training).

(To instead allow deletion would include the IsDecendantOf function of the HiererachyId datatype, and delete itself and all tasks below)

Also change so that it’s only the task name that is editable from the portlet, as this is the only column for which we have built update support at the moment.

Stored Procedure – MoveTask

Let us also create a stored procedure that can move tasks around in the tree (so that the subtasks also will be moved with it, keeping the underlying structure intact).

Create PROCEDURE [dbo].[MoveTask](@TaskId int, @TargetParentTaskId int)
AS
/*
2021-04-28 olcase: created
To Test:
MoveTask @TaskId = ,@TargetParentTaskId =
*/
BEGIN
DECLARE @CurrentHierarchy hierarchyid, @TargetHierarchy hierarchyid, @TaskName nvarchar(500)
SELECT @CurrentHierarchy = HId, @TaskName = TaskName FROM Task WHERE TaskId = @TaskId ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRANSACTION
SELECT @TargetHierarchy = HId FROM Task WHERE TaskId = @TargetParentTaskId ;
SELECT @TargetHierarchy = @TargetHierarchy.GetDescendant(max(HId), NULL)
FROM Task WHERE HId.GetAncestor(1)=@TargetHierarchy ;
Begin
 UPDATE Task
 SET HId = HId.GetReparentedValue(@CurrentHierarchy, @TargetHierarchy)
 WHERE HId.IsDescendantOf(@CurrentHierarchy) = 1 ;
 UPDATE Task
 SET ParentTaskId = @TargetParentTaskId
 WHERE TaskId = @TaskId
End
COMMIT TRANSACTION
END ;
GO

Note how we update HId with the IsDescendantOf-function (updating the task and all subtasks) and ParentTaskId with a normal update statement (updating just the task, its children will remain with the same parent/child values).

Since we do not expect all our end users to be SQL ninjas of your level, we might consider spoiling them with some user interface options to acquire the same result.

Since our table portlet is where we can remove tasks and edit the name of them, let us also add the possibility to create new tasks and move them in the same place.

​Executing Stored Procedures from Flow

Create a flow called Create Task that looks like this (with the green ring and FirstInput):

And add a user step with a static text and text input:

Insert step:

Note here that since we choose to work with naming the input variables in the procedure, we need to manually declare them in the machine step, or it will throw an error saying that it could not resolve the variable.

Another way around this as mentioned above, not having to declare them in the beginning, would be to just throw the values at the procedure (Exec NewTask @FirstInput__TaskId , @NewTaskName), but the benefits of specifying which parameter you are updating generally outpaces the negatives.

Connect our Create Task flow as a Custom button in the table portlet.

Create a new flow that we name MoveTask with the same green ring and First input step as above and then like this:

Connect the Move Task button in the same way as before

Voila, we now have a solution where the users can manage a "bottomless" hierarchy of tasks and visualize them in a tree and table portlet.

If needed we can now use our flows and portlets to rename tasks and create and/or move some of them so that the result looks something like this:

Another real-life usage of the hierarchy structure we are using this for is in a BI solution where we use an organization tree together with transactions where the history of the transactions should be unaffected by people moving around. If you stick the hierarchy id of an employee to every transaction the transaction’s hierarchy would remain under the right place in the tree, aggregating up correctly to the groups above even if organizational changes occur (e.g. people changing departments). Using the hierarchy datatype in this scenario also greatly improves the performance vs a classic parent-child structure when correct indexes are applied.

Row Level Security

As there will be different teams working on the Moon and Mars mission they should not see each other’s work. Setting permissions on entire tables and views won’t solve anything since they will use the same tables and views, just that they want to access different parts of them (i.e. different rows).

In a multi-tenant environment, you may want to limit access to data on a row level, which can either be done on a specific value in a column (typically CompanyId, this user should have access to all rows that have the value 10 in the column CompanyId) or we can use the hierarchy that we just learned about (the user that has access to /2/2/ shall see everything below).

Create RLSAccess table

For this we need a base table containing the access, let call it RLSAccess:

CREATE TABLE [dbo].[RLSAccess](
 [UserName] [nvarchar](250) NOT NULL,
 [Hierarchy] [hierarchyid] NOT NULL,
 [CompanyId] [smallint] NULL,
 [RowCreatedDt] [datetime2](0) NULL,
 [RowCreatedBy] [nvarchar](250) NULL,
 [RowUpdatedDt] [datetime2](0) NULL,
 CONSTRAINT [PK_RLSAccess] PRIMARY KEY CLUSTERED
(
 [UserName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[RLSAccess] ADD CONSTRAINT [DF_RLSAccess_RowCreatedDt] DEFAULT (getdate()) FOR [RowCreatedDt]
GORLSAccess table, where Hierarchy matches what we have in our Task-table

Create Row Level Security table function

Run this code to create the table function that will control the row level security, this will only be done once

CREATE FUNCTION [dbo].[fn_RowLevelSecurity](@Hierarchy hierarchyid)
 RETURNS TABLE
 WITH SCHEMABINDING
AS
Return SELECT 1 AS accessResult
 FROM dbo.RLSAccess
 WHERE @Hierarchy.IsDescendantOf((Select Hierarchy FROM dbo.RLSAccess where UserName = USER_NAME() Collate Finnish_Swedish_CI_AS)) = 1
GO

Run:

select * from task

and verify that you see all rows in the task table

Now we will enable the row level security for each table we want to apply it to. After this security policy is added on the table it has to be dropped if we want to make metadata changes to the table (e.g. add or modify a column) so I have included the first line commented here explaining how to drop the security policy, it could be a good idea to save your sql-script so that it easy to drop and re-enable the policy (when making changes to the table). Run this code

--DROP SECURITY POLICY RLSOn_Task
CREATE SECURITY POLICY RLSOn_Task
ADD FILTER PREDICATE dbo.fn_RowLevelSecurity(Hid)
ON dbo.Task
WITH (STATE = ON)

And again run:

select * from task

you should now not see any rows in the tasks table, even if you are the dbo.

Tab groups in MSSMS

Create a new query and choose to create a New vertical tab group

And in the new blank tab in your new tab group, change to log in with Moon

If needed be sure to navigate to your database (you might be in the master database)

--Use YourDb
select * from vtask

Then do the same thing for the user Mars, and running the same query now with different credentials should give you different results

And if we want to grant the first user access to everything? Easy right, just add a row in our RLSAccess table yes? Well maybe, but it could be so that since you to the server are the dbo, so just to validate what you are to the sql server run:

If it gives you dbo, enter dbo in the RLS table with access to the whole tree

And then running the query again should give you everything from root level and below. No logging off and back on needed, all changes are effective immediately.

To match this in flow you set up a connector to the database with no global user id on it, then the option then shows up for the user, so many users can access the Moon project using the same sql-user to validate against the database.

(If you are synchronizing users for Active Directory the syntax for setting a Connector Login during sync is: exec SetConnectorLogin('YourConnectorName', ‘Moon’, ‘YourSuperSafePassword’); )

SQL Indexes

Indexes play an important role for the performance of SQL server, and not least with hierarchies in large tables. Without diving too far down the rabbit hole of indexes (there are plenty of good resources out there, where the eminent page "SQL Authority" is a good start when searching for guides for SQL server issues, for example this best practice guide) we can think of indexes in SQL server as indexes in a book, if you want to read the book it is best to do it from page one and going forward. But if you are asking the book to just give you the page where it writes about the Moon, it would be faster to go to the last pages, scroll down to M and then see the word "Moon" followed by what pages include the word Moon (and in SQLs case, the sentences are actually attached to that row so you do not even need to look them up in the book, sort of).

Clustered and non clustered indexes

A clustered index is just a weird name for that you sort the entire table according to one or more columns. If you have an old person close to you, you can ask how the phone book (like Yellow Pages) was ordered back in the days when those existed. That is a perfect example of a clustered index where the index was on the last name, followed by first and middle names, and then one or more phone numbers and an address attached to that row.

A non clustered index is not the actual table, but a smaller copy of the table sorted on one or more other columns. Say you want to know all people living on Sesame street it would take you quite some time to read through the entire phone book, but having another small copy of it sorted (indexed) by address instead would make the task a lot easier, just as it would be for SQL Server. If you also would be interested in only how many with the last name "Bird" lives on that street you would add the column LastName as an "included column" in your non clustered index.

Hierarchy indexes

For hierarchies, we can work with "depth first" or " breadth first" indexes - meaning if the index should be quick to fetch all underlying data of a certain place in the hierarchy (depth, think of it as getting all personnel under a top manager, no matter how many layers down they are). Or if it should be quick to fetch all records at a certain level (breadth, getting all personnel that reports directly to one manager, but not the levels below them).

Showing SQL execution Plan in MSSMS

To visualize what the SQL Server query optimizer does you can press ctrl+m, or check this small button in the menu:

Include actual execution plan

Down in the results pane you will find a new tab that is called "Execution plan" where you can see how SQL Server went about fetching the required data. Executing it on vTask with just the clustered index on the task table will probably give us this:

Now create an index on the task table on the Hid column (a depth first index)

CREATE INDEX Task_Depth_First   
ON Task(Hid)

Now the query optimizer should use our index as can be seen here

Utilizing a depth first index

To read more about indexes on hierarchies, check out Microsoft's documentation.

SQL Functions, Cursor, loops and (audit) triggers are also things that can be handy to know about in regards to SQL server which we have not yet had the time to add examples of. Please let us know if that is something you would like expedited.

Last updated