Daniel Steward

On implementing SCIM, in practice

At Hut Six we’ve been operating a SCIM API for over four years and it’s been a great time saver for us. I thought I’d share some of the considerations and decisions that went into building this out.

Why SCIM?

Before SCIM, user provisioning (i.e. getting a customer’s users and groups into our platform) at Hut Six was always painful and usually involved tedious CSVs. This would inevitably hold up customer onboarding for any reasonably sized organisation.

Updates to users/groups could be even more painful. In theory, customers simply needed to upload a CSV containing all their users and group assignments, and this would create new users and remove absent users automatically. But in practice, these CSVs were often riddled with errors and required extensive checks and amendments.

SCIM solves this problem by offering a standardised way to sync users and groups from an Identity Provider (IdP), such as Microsoft Entra, to a third-party Service Provider like ourselves at Hut Six.

With Microsoft 365 and Entra practically standard in medium to large enterprises, a SCIM integration is quickly becoming a must-have for SaaS platforms.

SCIM overview

SCIM operates over HTTP using JSON; the Service Provider must implement a set of HTTP endpoints that the IdP can send requests to, as and when changes to users and groups occur.

I won’t go into detail on the ins and outs of the standard as that is comprehensively covered by the two RFCs on the SCIM schema (RFC 7643) and protocol (RFC 7644).

You might also find Microsoft’s and Okta’s explanations enlightening.

One thing to note is that you almost certainly want to implement the “Enterprise User Schema Extension”, as Microsoft Entra will send that information over by default.
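To make the extension concrete, here is a minimal sketch of a SCIM User resource that carries it, plus a helper that pulls out a couple of its attributes. The two schema URNs come from RFC 7643; the attribute values and the helper name `extract_enterprise_fields` are made up for illustration and are not taken from the Hut Six implementation.

```python
CORE_USER = "urn:ietf:params:scim:schemas:core:2.0:User"
ENTERPRISE_USER = "urn:ietf:params:scim:schemas:extension:enterprise:2.0:User"

# Illustrative payload only: the values below are invented.
payload = {
    "schemas": [CORE_USER, ENTERPRISE_USER],
    "externalId": "a1b2c3",
    "userName": "alex@example.com",
    "active": True,
    "name": {"givenName": "Alex", "familyName": "Example"},
    # Extension attributes live under a key named after the extension URN.
    ENTERPRISE_USER: {"department": "Finance", "employeeNumber": "1042"},
}


def extract_enterprise_fields(resource: dict) -> dict:
    """Return selected enterprise attributes, or {} if the extension is absent."""
    if ENTERPRISE_USER not in resource.get("schemas", []):
        return {}
    extension = resource.get(ENTERPRISE_USER) or {}
    return {
        "department": extension.get("department"),
        "employee_number": extension.get("employeeNumber"),
    }
```

The point to notice is the namespacing: extension attributes are not mixed in with the core ones, so a deserialiser that only knows the core schema will silently drop them.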

Grappling with the real world

Now, the interesting part: how do organisations actually interact with SCIM?

IdP integrator vs the internal product owner.

This is probably the most important driver of requirements, so if you’re skimming then read this!

The person responsible for setting up SCIM inside a customer organisation may not be the same as the person with internal ownership of your SaaS platform. Certainly with the customers we saw at Hut Six, these are almost always different people.

The IdP integrator is usually someone who manages or helps administer the entire user directory and any number of other systems too. They want to get the integration with your platform set up quickly without making changes to their directory and then, ideally, never have to touch your platform again.

The product owner, meanwhile, wants to make sure that the users have access to what they need, that the list of users being synced is accurate, and that any information useful for reporting is also synced across.

There is a natural tension in this professional relationship and if you decide to shift too much responsibility to the SCIM integration, you risk disempowering the internal product owner and forcing them to go through the IdP integrator for any meaningful change.

For example, for us this meant keeping user role assignment within the Hut Six platform. Though SCIM can sync this information across, it is something that the Hut Six product owner should be responsible for.

Directories are often unorganised, filled with service accounts and hard for customers to improve

From our previous experience with syncing users from on-prem Active Directory, we knew that user directories are typically far from perfect. Introducing a meaningful structure to a directory of thousands of users and then keeping it up to date is no small task, and we didn’t want it to prevent our customers using SCIM.

This means:

  • We can’t rely just on group membership or attribute filters to scope the users accurately.
  • We should assume that our SCIM API will receive both genuine users and service accounts.
  • Having service accounts on the platform is highly undesirable: they take up licences and skew metrics.

Therefore, we’ll need a way to selectively exclude users from being present on our platform, outside of the IdP.

Data Retention Requirements

The data retention requirements for our platform could well be longer than for the IdP, so having a way of opting out of hard deletes entirely is useful.
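One way to picture this opt-out is a per-customer setting that decides what a SCIM DELETE actually does. The sketch below is an assumption about how such a switch could look; `ScimSettings`, `Account` and `handle_scim_delete` are hypothetical names, not the platform's real types.

```python
from dataclasses import dataclass


@dataclass
class ScimSettings:
    # Hypothetical per-customer setting; off by default so data is retained.
    allow_hard_deletes: bool = False


@dataclass
class Account:
    email: str
    active: bool = True
    deleted: bool = False


def handle_scim_delete(account: Account, settings: ScimSettings) -> str:
    """Apply a SCIM DELETE according to the customer's retention choice."""
    if settings.allow_hard_deletes:
        account.deleted = True
        return "deleted"
    # Retention mode: keep the record, but block further use of the account.
    account.active = False
    return "deactivated"
```

Either way the IdP sees its DELETE succeed; only the platform-side effect differs.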

Switching existing customers from CSVs to SCIM

Customers will need a clear, easy to grasp path to switching from manual user maintenance to using SCIM. They will want to test it first, and ensure that there is no interruption of service for their users.

Two features can smooth over these concerns:

  • A DRY-RUN mode, so that customers can see exactly what is being received by our platform over SCIM before going LIVE.
  • Protection against accidental deletions, by ensuring that existing user accounts are not exposed to the SCIM API until they are also being synced from the IdP.

While the latter is helpful during the switch-over, it does risk old user accounts continuing to linger in the system. So, a way to identify users that were not synced by SCIM is very helpful as part of a pre- or post-switchover cleanup.
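Such a report can be as simple as a set difference on e-mail addresses. This is a minimal sketch, assuming e-mail is the matching key and that comparison should ignore case and surrounding whitespace; the function name `users_not_in_scim` is hypothetical.

```python
def users_not_in_scim(platform_emails, scim_emails):
    """Return platform e-mail addresses that have no counterpart in the SCIM sync."""
    # Normalise once so that "A@x.com" and "a@x.com " count as the same user.
    synced = {email.strip().lower() for email in scim_emails}
    return sorted(
        email for email in platform_emails
        if email.strip().lower() not in synced
    )
```

For example, with platform accounts `["A@x.com", "b@x.com", "svc@x.com"]` and synced addresses `["a@x.com", "B@X.com"]`, only `svc@x.com` is reported as unsynced.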

Putting it together

Put all this together and the onboarding/switchover process looks like this:

  1. IdP integrator performs initial set-up of the SCIM connection with the Service Provider set to DRY-RUN mode.
  2. The initial sync occurs and populates a test environment.
  3. The IdP integrator and internal product owner check that the list of users and groups matches expectations.
    1. They can run a report that identifies any users not included in the SCIM sync (matching on e-mail address).
    2. At this point they can exclude any errant service accounts manually.
  4. Product owner checks that any non-technical integration settings match their requirements, e.g. whether hard user deletes are enabled.
  5. Once happy, the product owner switches to LIVE mode. The sync now takes effect: existing accounts are taken over by SCIM, any new accounts are created and some accounts may be disabled (by the active attribute). No accounts should be deleted at this point.
  6. With the integration now in place, users and groups are automatically kept up to date.

Our solution: Two-stage sync

In order to hit these requirements we opted for a two-stage sync with the SCIM API syncing to shadow Users and Groups database tables.

In DRY-RUN mode, these users and groups stay in the shadow tables. Only once LIVE mode is enabled does the system sync the SCIM entities to the platform user and group entities. For existing users on first sync, the system will match on e-mail addresses. It won’t attempt to match to any non-SCIM created groups.

This set-up easily allows for selective exclusion of service accounts from within the Hut Six platform. An isExcluded field is maintained on the SCIM user entity and if true, the entity is skipped for the second-stage sync.
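The second stage described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Hut Six code: the shadow and platform tables are modelled as in-memory lists, and `Mode`, `ScimUser`, `PlatformUser` and `second_stage_sync` are hypothetical names. Only the behaviour comes from the text: nothing happens in DRY-RUN, excluded users are skipped, and existing accounts are matched on e-mail the first time round.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Mode(Enum):
    DRY_RUN = "dry-run"
    LIVE = "live"


@dataclass
class ScimUser:  # a row in the shadow table, written by the SCIM API
    scim_id: str
    email: str
    active: bool = True
    is_excluded: bool = False


@dataclass
class PlatformUser:  # a row in the real user table
    email: str
    active: bool = True
    scim_id: Optional[str] = None  # None means "not managed by SCIM"


def second_stage_sync(
    mode: Mode, shadow: List[ScimUser], platform: List[PlatformUser]
) -> List[Tuple[str, str]]:
    """Apply shadow users to platform users and return the actions taken."""
    if mode is Mode.DRY_RUN:
        return []  # shadow tables only; the platform is left untouched
    by_scim_id = {u.scim_id: u for u in platform if u.scim_id}
    unmanaged_by_email = {u.email.lower(): u for u in platform if u.scim_id is None}
    actions = []
    for s in shadow:
        if s.is_excluded:
            actions.append(("skipped", s.email))
            continue
        target = by_scim_id.get(s.scim_id)
        if target is None:
            target = unmanaged_by_email.get(s.email.lower())
            if target is not None:
                target.scim_id = s.scim_id  # first sync: take over by e-mail
                actions.append(("linked", s.email))
            else:
                target = PlatformUser(email=s.email, scim_id=s.scim_id)
                platform.append(target)
                actions.append(("created", s.email))
        target.email = s.email
        target.active = s.active
    return actions
```

Note that the function never removes a platform user, which mirrors the goal that no accounts are deleted at the moment of switching to LIVE.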

Another advantage of this architecture is that it provides an easy way to ensure that what you expose to the SCIM API exactly matches what was sent across. This is vital, as otherwise the IdP will keep issuing updates to your SCIM endpoints.
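As a rough sketch of that idea, the shadow store can keep each resource exactly as it was received and build its responses from that copy rather than from the platform's own user model. `ShadowStore` and its in-memory dictionary are assumptions for illustration; a real implementation would sit on the shadow database tables and return fuller `meta` information.

```python
import copy
import uuid


class ShadowStore:
    """Keeps SCIM resources as received, so reads echo back what the IdP sent."""

    def __init__(self):
        self._resources = {}

    def create(self, received: dict) -> dict:
        resource_id = str(uuid.uuid4())
        # Deep copy so that later changes to the caller's dict don't leak in.
        self._resources[resource_id] = copy.deepcopy(received)
        return self.get(resource_id)

    def get(self, resource_id: str) -> dict:
        body = copy.deepcopy(self._resources[resource_id])
        # Server-assigned attributes are added on top of the stored payload.
        body["id"] = resource_id
        body["meta"] = {"resourceType": "User"}
        return body
```

Because the platform-side entity may normalise or drop attributes, reading from it instead could produce a representation that differs from what the IdP sent, which is the situation the paragraph above warns about.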

Some final implementation notes

  • Microsoft API quirks required custom deserialisation for the SCIM endpoints, specifically because the manager attribute is implemented in a non-standard way
  • SCIM has extensive querying and filtering support, but very little of it is actually needed
  • If you aren’t a gallery app, Microsoft Entra won’t throttle requests and customer changes often impact a large percentage of users at once, so be prepared to handle very bursty traffic on these endpoints.
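On the manager quirk in the first note: RFC 7643 defines manager as a complex attribute whose identifier sits in a `value` sub-attribute. The sketch below is a tolerant parser that accepts that standard shape and, as an assumption about what a non-standard payload might look like, also a bare string. The exact shape Entra sends is not described above, so treat the string case as illustrative; `parse_manager_id` is a hypothetical helper.

```python
def parse_manager_id(raw):
    """Return the manager's id from either shape, or None if it is absent."""
    if raw is None or raw == "":
        return None
    if isinstance(raw, str):
        # Assumed non-standard shape: the id is sent as a plain string.
        return raw
    if isinstance(raw, dict):
        # Standard shape per RFC 7643: {"value": "<id>", ...}
        return raw.get("value") or None
    raise ValueError(f"Unsupported manager representation: {type(raw).__name__}")
```

Keeping this leniency in one small function means the rest of the deserialisation can stay strict.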

On Blazor and front-end development with .NET 9

Planning a wedding has given me the perfect excuse for a project to try out some different tech. We wanted a simple, private website to help us share information and collect RSVPs.

Vue and ASP.NET MVC are my bread and butter for front-end, but I’ve been wanting to try out Blazor in a Server Side Rendering set-up for quite some time.

The experience was sadly a disappointing one: hot reload is essentially non-functional (at least when using Rider), massively slowing development, and it also lacks a number of niceties that Vue supports.

The most surprising was a lack of built-in support for specifying additional classes for components at the point of use, like so:

<ButtonComponent class="mt-2 mb-4" ... />

When composing small components together, being able to easily specify additional classes is incredibly useful and eliminates pointless wrapping elements that often get in the way.

You can, of course, add support for this yourself in a component, but I think the issue is emblematic of the shortcomings and niggling issues that are currently present in Blazor.

I was also disappointed to discover that Blazor uses a synchronization context to keep requests on the same thread. A website under load would take a hit to latency as after awaiting, it has to wait for the original thread to yield back to it in order to continue rather than being able to use the first available thread.

The other Blazor modes hold limited appeal for me, requiring either a constant connection with the server or a very large payload - so I don’t see myself reaching for Blazor again any time soon.