Tutorial: Creating privacy streams and running your own local demo from our CLI.

5 min task

With our CLI, you can run a quick demo yourself in under 5 minutes. We'll create a stream, set up privacy streams, and simulate sending data and reading it back, all from your terminal.

by The Stream Team on

Intro

One of the key observations in founding Stream Machine was that true privacy-by-design in data and at scale is hard to achieve.

We think we made it much simpler. To let you quickly see this for yourself, we included a CLI command that simulates sending events, so you can read them back from a privacy stream. This post will guide you through the steps.

We assume you have created an account and you have our CLI installed and running. Everything in this demo will be done from the command line.

Scope

The new sim run-random command helps you to quickly see what is happening to your data. We will:

  • Authenticate against Stream Machine
  • Create a stream
  • Create a derived stream to set consent levels
  • Start a simulator to send data
  • Read a privacy stream on that data with specific consent levels.

1. Authenticate to Stream Machine

First, authenticate to our platform with the auth command:

#authenticate
❯ strm auth login [your@email.com]
❯ Enter password: ***

This will set up the necessary tokens to safely communicate with our APIs.

2. Create a new stream

Creating a stream is done through the create stream command (we said ‘simple’ after all ;-) )

#create stream
❯ strm create stream winston --save

You have now created a “stream” (hence the command), which you can think of as a pipeline through which your data flows.

Note: Upon stream creation, your terminal prints a set of credentials. It’s important to securely store these in a file somewhere to use inside your apps (we recommend an encrypted file or inside your password manager).

You cannot access your data in any way other than with these credentials.

If you don’t want to store the stream settings on disk, you can omit the --save flag.

Creating a privacy stream

To read back data, you create privacy streams, which apply additional config for further processing (like the consent levels under which the data is collected). You can think of a privacy stream as an egress endpoint that contains nothing but data you are allowed to use under a specific consent.

Context

For our purpose here, we will assume you have two consent levels under which you collect the data: 0 - basic and 1 - personalized, and that you’re sending events from a situation where a user is logged in (and so you can have a customer ID for purposes of offering that service).

Create the privacy streams

Let’s first create the privacy stream for the 0 - basic level.

#create the 0 - basic privacy stream
❯ strm create stream --derived-from winston --levels 0 --save

If you want more fine-grained control, set the levels and consent type explicitly:

#create a derived stream of type granular with consent level 0
❯ strm create stream --derived-from winston --levels 0 --consent-type GRANULAR --save

As 0 is the lowest level, event contract fields that are sent with consent 0 or higher will be decrypted in the winston-0 stream.

For purposes of this post, we’ll also create the 1 - personalized consent level:

#create the 1 - personalized privacy stream
❯ strm create stream --derived-from winston --levels 1 --save

Note: The difference between --consent-type cumulative and granular is laid out in the docs.
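
To make the effect of consent levels concrete, here is a toy Python model of what a privacy stream exposes. It is a sketch and not Stream Machine's implementation: it assumes a field is readable only when the privacy stream's level is at least the level the field requires. The FIELD_LEVELS mapping, the view_at_level helper and the "<encrypted>" placeholder are all hypothetical names made up for this illustration.

```python
# Toy model of a privacy stream; NOT the platform's actual implementation.
# Assumption: a field is readable when stream level >= the field's required level.

# Hypothetical mapping of event fields to the consent level they require.
FIELD_LEVELS = {"url": 0, "producerSessionId": 1}


def view_at_level(event: dict, stream_level: int) -> dict:
    """Return the event as a privacy stream of `stream_level` would expose it."""
    view = {}
    for field, value in event.items():
        required = FIELD_LEVELS.get(field, 0)  # unlisted fields: assumed level 0
        view[field] = value if required <= stream_level else "<encrypted>"
    return view


event = {
    "url": "https://www.streammachine.io/rules",
    "producerSessionId": "session-327",
}
basic = view_at_level(event, 0)         # what a level-0 stream would show
personalized = view_at_level(event, 1)  # what a level-1 stream would show
print(basic)
print(personalized)
```

In this model the level-0 view hides producerSessionId while the level-1 view shows it. How the real platform treats cumulative versus granular consent is described in the docs, not here.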

Start a simulator and send data with sim run-random

To start a simulator and send some dummy data, you can use the built-in command sim run-random on your stream. The simulator uses the generic and simple clickstream demo event contract.

If you did not set the --save flag before, make sure to pass the client-id and client-secret:

# Run a simulator and send data
❯ strm sim run-random winston
# Or without the save flag
❯ strm sim run-random winston --client-id [string] --client-secret [string]

Note: If you are just testing the pipeline and don’t want to spam your terminal with event prints, --quiet has your back.
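
If you want to start the simulator from a script rather than by hand, a small wrapper can assemble the same command. The sketch below only builds the argument list, using the flags shown in this post (--client-id, --client-secret, --quiet). The sim_command helper and the STRM_CLIENT_ID / STRM_CLIENT_SECRET environment variable names are assumptions for this example, not something the CLI defines.

```python
import os
import shlex


def sim_command(stream, client_id=None, client_secret=None, quiet=False):
    """Build the argv for `strm sim run-random`, using only flags shown in this post."""
    argv = ["strm", "sim", "run-random", stream]
    # Credentials are only needed when the stream was created without --save.
    if client_id and client_secret:
        argv += ["--client-id", client_id, "--client-secret", client_secret]
    if quiet:
        argv.append("--quiet")
    return argv


# Hypothetical environment variable names, chosen for this sketch only.
argv = sim_command(
    "winston",
    client_id=os.environ.get("STRM_CLIENT_ID"),
    client_secret=os.environ.get("STRM_CLIENT_SECRET"),
    quiet=True,
)
print(shlex.join(argv))
# To actually start the simulator: subprocess.run(argv, check=True)
```

Reading the credentials from the environment keeps them out of the script itself. The subprocess call is left as a comment so the sketch runs without the CLI installed.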

Read and inspect data

With your terminal sending events, the real proof of the pudding is in seeing what’s passing through the privacy streams.

Fire up a new terminal window (CTRL | CMD + T) and run strm egress winston-0:

#Read the privacy stream for stream winston consent 0 - basic 
❯ strm egress winston-0

{"strmMeta": {"schemaId": "clickstream", "nonce": -1890771136, "timestamp": 1626101758721, "keyLink": "81f112bb-05fe-4f06-942b-6ec012eb7c39", "billingId": "demo4678984730", "consentLevels": [0]}, "producerSessionId": "AVba3kpDl1nmFkB4OPykJYbrvSDCfS4OKMLcP21eYIk=", "url": "https://www.streammachine.io/rules", "eventType": "", "referrer": "", "userAgent": "", "conversion": 0, "customer": {"id": "customer-session-593"}, "abTests": []}

This reads, over a simple WS interface, the data that is being sent over the winston stream with consent level 0 (remember, that’s the safest privacy stream).
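
Each egressed line is a JSON document, so it is easy to inspect in code. The Python sketch below parses a shortened copy of the event printed above and reads its metadata. It assumes that timestamp is milliseconds since the Unix epoch, which the post does not state, although the value is consistent with that reading.

```python
import json
from datetime import datetime, timezone

# The event printed by `strm egress winston-0` above, shortened to the fields used here.
line = (
    '{"strmMeta": {"schemaId": "clickstream", "timestamp": 1626101758721, '
    '"consentLevels": [0]}, "url": "https://www.streammachine.io/rules"}'
)

event = json.loads(line)
meta = event["strmMeta"]

# Assumption: `timestamp` is milliseconds since the Unix epoch.
sent_at = datetime.fromtimestamp(meta["timestamp"] / 1000, tz=timezone.utc)

print(meta["schemaId"], meta["consentLevels"], sent_at.isoformat())
```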

As we set different consent levels for each of the two privacy streams, you would expect to see a difference in the egressed events:

#Read the privacy stream that is safe to use under consent level 1
❯ strm egress winston-1

{"strmMeta": {"schemaId": "clickstream", "nonce": 966067871, "timestamp": 1626101816930, "keyLink": "c123e38a-6715-4ae4-bfe6-41168eca98f2", "billingId": "demo4678984730", "consentLevels": [0, 1]}, "producerSessionId": "session-327", "url": "https://www.streammachine.io/rules", "eventType": "", "referrer": "", "userAgent": "", "conversion": 0, "customer": {"id": "customer-session-327"}, "abTests": []}

See what happens? As the producerSessionId field needs consent 1, it remains encrypted in the winston-0 stream and is only readable in the winston-1 stream. We apply additional processing (like key rotation) to safeguard privacy inside the data.
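
If you want to spot the difference programmatically, a rough heuristic is enough for this demo. In the two sample outputs above, the encrypted producerSessionId happens to be valid base64 text while the readable one is not. The Python sketch below relies on that observation only; it is not a documented format, and the looks_encrypted helper is a hypothetical name.

```python
import base64
import binascii


def looks_encrypted(value: str) -> bool:
    """Heuristic: True if `value` is non-empty, valid base64 text.

    This is an assumption drawn from the two samples in this post only;
    a short plain value such as "abcd" would also pass this check.
    """
    if not value:
        return False
    try:
        base64.b64decode(value, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


# producerSessionId values copied from the two egress outputs above.
samples = {
    "winston-0": "AVba3kpDl1nmFkB4OPykJYbrvSDCfS4OKMLcP21eYIk=",
    "winston-1": "session-327",
}
for stream, session_id in samples.items():
    state = "still encrypted" if looks_encrypted(session_id) else "readable"
    print(f"{stream}: producerSessionId is {state}")
```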

What did we just do?

  • We authenticated against Stream Machine and created a new stream
  • We created derived streams to set consent levels
  • We started the built-in simulator with sim run-random to send data to the input stream
  • We read the privacy streams on that data, each with its own consent level.

This is just a very simple example of creating privacy streams and seeing what happens to the data underneath. Please reach out if you want to learn more or have more elaborate use cases!

PS We’re hiring! Want to join in on the fun?

Do you care deeply about engineering for privacy and want to contribute to tools like our CLI and all the engineering we are abstracting away? We are hiring!
