Session Authentication with Lambda and DynamoDB
In this tutorial, we create Session Authentication using AWS Lambda and DynamoDB. We go over what Session Authentication is, why we use Lambda for it, and build it from scratch. We also go over testing, packaging, and deploying the Lambda functions using the Serverless Application Model (SAM) framework.
What is Session Authentication?
As the name suggests, session authentication is a type of authentication built around user sessions. It is one of the most widely used kinds of authentication and one of the easiest to implement.
How Does It Work?
When a user enters their credentials and submits a request to log in, the backend first checks if the credentials are valid and, if they are, generates a random string. This randomly generated string is our session token.
The token is then stored in the database along with any other required data, such as the User ID. We call it a token because that’s what it is: a token that grants access to a set of services. It is also stored on the client side as a cookie, which is sent on every subsequent request to the backend of the application.
The following things happen when a user sends a request to the API:
- The request, which contains the cookies, is sent to the server
- The backend parses the cookies and gets the session token
- The session token is validated by looking it up in the database that stores it and, if it is valid, the session data is returned
This is the simplest version of session authentication we can implement.
Note, the database can be of any kind. However, since the session info is read very often, it’s useful to store it in databases that are built to have extremely fast read speeds.
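To make the flow above concrete, here is a minimal sketch in plain Node.js. It is an illustration only: the in-memory Map stands in for the database, and the names login, authenticate, parseCookies, and the sessionToken cookie are assumptions made for this example, not part of the code we build later.

```javascript
const crypto = require("crypto");

// In-memory stand-in for the session database (hypothetical; a real app would use a database).
const sessions = new Map();

// Called after the user's credentials have been verified.
function login(userId) {
  const token = crypto.randomBytes(32).toString("hex"); // random session token
  sessions.set(token, { userId, createdAt: Date.now() });
  return token; // the backend would send this back to the client as a cookie
}

// Parses a raw Cookie header into a name/value object.
function parseCookies(header = "") {
  const cookies = {};
  for (const pair of header.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue;
    cookies[pair.slice(0, index).trim()] = decodeURIComponent(pair.slice(index + 1).trim());
  }
  return cookies;
}

// Runs on every request: parse the cookies, read the token, look up the session.
function authenticate(cookieHeader) {
  const token = parseCookies(cookieHeader).sessionToken;
  return sessions.get(token) || null;
}
```

A request whose cookie carries an unknown or missing token gets null back, which the backend would treat as "not logged in".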
But Why Use Lambda for This?
You may be wondering why we need to use Lambda for session authentication when the logic to implement it is not so complex and requires minimal effort. The answer to this, like a lot of other questions today, is microservices.
For a monolithic application, using Lambda would be counter-productive as you often have a single codebase where all your logic exists. There could even be dips in performance because you’ll be sending a request to the Lambda Function every time a user sends a request.
But in a modern-day application, there are often tens, hundreds, or even thousands of microservices, and reimplementing the logic required to authenticate a user in each service can get quite cumbersome to write and maintain.
The most common rule followed by developers is probably DRY (Don’t Repeat Yourself), and that is exactly what we’re trying to achieve. With Lambda, every service calls the same functions, so all the logic is in one place and maintenance becomes much easier. This also creates an extra layer of abstraction which could be very useful.
Important: this doesn’t mean you should always strive towards DRY code. Here’s a great talk on WET code (the opposite of DRY code).
What About JWTs?
If you don’t know, JWT stands for JSON Web Tokens. It’s another kind of authentication that has gained immense popularity and adoption in recent years.
The main advantage is that the JWT is cryptographically signed and the session data is stored in the token itself, which means the backend doesn’t have to send a request to the database every time a user sends a request, potentially leading to better performance in a lot of cases.
However, it’s not all bells and whistles with JWTs.
Disadvantages of JWTs
- It is harder to implement securely. Here’s an article describing the harder pieces that a JWT implementation needs in order to be secure.
- Limitations in the data that can be stored. Sensitive user data cannot be stored in a JWT because the payload of a signed JWT is only encoded, not encrypted, so anyone who holds the token can read it.
- Features that require you to know which devices the user is logged in on are not possible with the token alone. For example, showing the user all the devices they’re logged in on, or logging out of one device from another.
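The point about sensitive data can be seen with a small sketch. It builds an HS256-signed token by hand with Node’s built-in crypto module (assuming Node.js 16 or later for base64url support) and then reads the payload back without the secret. The helper names and the sample payload are made up for this example; a real application would use a maintained JWT library.

```javascript
const crypto = require("crypto");

const base64url = (input) => Buffer.from(input).toString("base64url");

// Builds an HS256-signed token by hand, purely for illustration.
function signToken(payload, secret) {
  const header = base64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = base64url(JSON.stringify(payload));
  const signature = crypto
    .createHmac("sha256", secret)
    .update(`${header}.${body}`)
    .digest("base64url");
  return `${header}.${body}.${signature}`;
}

// Anyone holding the token can do this; no secret is involved.
function readPayloadWithoutSecret(token) {
  const body = token.split(".")[1];
  return JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
}
```

The signature only protects the payload against tampering. It does nothing to hide it.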
Prerequisites
- An AWS account. Don’t worry, you will not be billed for anything as we’ll be using DynamoDB and Lambda which can be used for free with certain limits.
- Node and NPM
- Docker — To run our Lambda Functions locally
- Your editor of choice
If you’re comfortable with using the AWS CLI for the below steps you can do so. However, we’ll be working directly with the AWS Dashboard as it’s simpler to get started with.
Make sure you’re logged in to an AWS account that has the required permissions to use DynamoDB, Lambda, CloudFormation, and S3. If you’re using a personal account or a root user account, you don’t have to worry about the permissions and can move forward.
Create a DynamoDB Table
First, log in to your AWS Console and head over to DynamoDB. Then, click on Create Table.
DynamoDB is a schemaless NoSQL database. It’s a hybrid of a document database and a key-value store. We’re using DynamoDB because it’s serverless which means we have almost nothing to manage, it’s extremely fast and reliable, and since it’s schemaless we can store unstructured data.
If you’ve never used a NoSQL database like DynamoDB or MongoDB before and are coming from using a traditional relational database like PostgreSQL, I’d suggest reading this section from the AWS docs to learn more about how they compare and how they work.
When we click on Create Table, we’re taken to the below page.
Let’s analyze the different components that go into a DynamoDB table.
- Table Name — Quite literally, the name of the table
- Primary Key — Just like how in a relational database we have a Primary Key to identify a certain record, we have the Primary Key in DynamoDB.
If you look closely, you’ll also notice something called a Partition Key and a checkbox for a Sort key.
The primary key is made up of a partition key (hash key) and an optional sort key. The partition key is used to partition data across hosts for scalability and availability. Choose an attribute which has a wide range of values and is likely to have evenly distributed access patterns. For example CustomerId is good while GameId is bad if most of your traffic relates to a few popular games.
The sort key allows for searching within a partition. For example, an Orders table with primary attribute CustomerId and sort attribute OrderTimestamp would allow for queries for all orders by a specific customer in a given date range.
At this point if you’re confused about the terminology used, don’t worry, you’re not alone. The naming of the partition and sort keys is linked to the inner workings of DynamoDB and how it uses the two to distribute and store data, and you don’t have to know much about what they are. However, for the curious ones out there, you can check out this section of the AWS docs which explains the two in much more detail.
Let’s move forward by naming our table UserSessions and the primary key sessionId, which is of type string.
We’ll be sticking with the default settings as that will help us get started quickly and we don’t need to modify anything to get up and running with our application.
Finally, click Create.
Working With DynamoDB
The API to work with DynamoDB is fairly simple. You can execute commands using the REST API, the AWS CLI, or by using the DynamoDB SDK. We’ll only be covering the basics of working with DynamoDB in this tutorial and won’t be going in-depth, but as always feel free to explore the documentation if you want to learn more about working with DynamoDB.
In DynamoDB, an item is a collection of attributes where each attribute is a key-value pair and the value can be a scalar, a set, or a document type (documents are similar to JSON objects). To put it in simple words, an item is a record with multiple properties that are stored as key-value pairs. Each table has multiple items, and each item has multiple attributes.
To work with the data in DynamoDB we make use of operations. Operations are commands we can use to read and modify data in our DynamoDB table. There are four main operations for Create, Read, Update, and Delete (CRUD) functionality, namely PutItem, GetItem, UpdateItem, and DeleteItem.
Writing Data
To write data to a DynamoDB table we make use of the PutItem operation.
{
"sessionId": { "S": "abcd-abcd-abcd" },
"userId": { "S": "dcba-dcba-dcba" },
"timestamp": { "N": "1612969254" },
"isActive": { "BOOL": true }
}
If we perform the PutItem operation with the above input on the table we created, a new item gets created in our table with sessionId set to abcd-abcd-abcd, userId set to dcba-dcba-dcba, timestamp set to 1612969254, and isActive set to true. Remember, we set sessionId as our primary key with type string, so its value has to be a string, else an error will be thrown. It also has to be unique per item: by default, PutItem replaces an existing item that has the same primary key rather than raising an error.
But what are S, N, and BOOL?
That’s the data type of the value we’re providing. S stands for string, N for number, and BOOL for boolean. Note that number values are sent as strings inside the N wrapper. You can find the full list of all the available data types along with their constraints in the official documentation.
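As a rough sketch, the input for a PutItem request on our table could be assembled like this. The buildPutItemInput helper is hypothetical, and the ConditionExpression is an optional addition that makes DynamoDB reject the write when an item with the same sessionId already exists.

```javascript
// Builds the input for a PutItem request in DynamoDB's typed JSON format.
function buildPutItemInput(session) {
  return {
    TableName: "UserSessions",
    Item: {
      sessionId: { S: session.sessionId },
      userId: { S: session.userId },
      // Numbers are sent as strings inside the "N" wrapper.
      timestamp: { N: String(session.timestamp) },
      isActive: { BOOL: session.isActive },
    },
    // Assumption: fail the write instead of replacing an existing session.
    ConditionExpression: "attribute_not_exists(sessionId)",
  };
}
```

With the AWS SDK, an object like this would be passed to the putItem method of a DynamoDB client.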
Updating Data
Similar to the PutItem operation, we use the UpdateItem operation to update data in our table. However, the UpdateItem operation works a little differently. We have to provide the primary key of the item whose data we want to modify and provide an update expression. An update expression specifies which attribute’s value to modify.
Suppose we want to modify the isActive attribute of the item we just created in the previous section. Our update expression would be SET isActive = :activeStatus, where :activeStatus is a placeholder for the attribute value, which we pass using the ExpressionAttributeValues argument.
The ExpressionAttributeValues argument looks similar to the input we provide to the PutItem operation, like below.
{
":activeStatus": { "BOOL": true }
}
We specify the placeholder as the key, and provide its value together with the value’s data type.
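Putting the pieces together, a complete UpdateItem input might look like the sketch below. The buildSetActiveInput helper is a hypothetical name used only for this illustration.

```javascript
// Builds the input for an UpdateItem request that sets the isActive attribute.
function buildSetActiveInput(sessionId, isActive) {
  return {
    TableName: "UserSessions",
    // The primary key identifies which item to update.
    Key: { sessionId: { S: sessionId } },
    // The expression names the attribute and a placeholder for its new value.
    UpdateExpression: "SET isActive = :activeStatus",
    // The placeholder is resolved here, in DynamoDB's typed format.
    ExpressionAttributeValues: { ":activeStatus": { BOOL: isActive } },
  };
}
```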
Reading Data
To read data we provide the primary key of the item we want to retrieve to the GetItem operation.
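A GetItem input is the smallest of the three, as this sketch shows. Both helper names are hypothetical. When no item matches the key, the response simply has no Item element, so the second helper turns that into null.

```javascript
// Builds the input for a GetItem request: just the table and the primary key.
function buildGetItemInput(sessionId) {
  return {
    TableName: "UserSessions",
    Key: { sessionId: { S: sessionId } },
  };
}

// Returns the item from a GetItem response, or null when nothing matched.
function extractItem(response) {
  return response.Item || null;
}
```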
So far, the operations we’ve talked about and how we use them may seem a little vague as we haven’t run them yet. Hopefully, it becomes more clear when we write some code in the next section where we work with Lambda and DynamoDB.
Creating the Lambda Functions
Unlike how we worked with DynamoDB directly from the AWS Dashboard, we’ll be working solely from our text editor and terminal when working with Lambda. We’ll even be bundling and deploying the Lambda functions directly from our terminal which we’ll look at in the next section.
Functions We’ll Be Creating
- Create session
- Validate and get session info
- Deactivate session
Project Setup
- Create three separate folders named create-session, get-session-info, and deactivate-session, one for each function.
- In each folder, run npm init -y to initialize NPM.
- All the Lambda functions will need to interact with DynamoDB, which we’ll be doing using the AWS JavaScript SDK. We’ll also need the @aws-sdk/util-dynamodb package, which contains utility functions that make working with the SDK easier. To install the SDK and utility library, run npm i @aws-sdk/client-dynamodb @aws-sdk/util-dynamodb in each of the folders.
Create Session Function
For the create-session function, we’ll be using an additional package called crypto-js which contains functions that use different algorithms to generate hashes. To install it, run npm i crypto-js.
We first create a generateId function that randomly generates a Session ID using the SHA256 hash function, with the input being a concatenated string of the User ID, current timestamp, and a randomly generated number.
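Here is one way generateId could look. This is a sketch rather than the tutorial’s original code: it swaps crypto-js for Node’s built-in crypto module so it runs without extra dependencies, and the exact way the three inputs are concatenated is an assumption.

```javascript
const crypto = require("crypto");

// Hashes the user ID, the current timestamp, and a random number with SHA-256.
function generateId(userId, now = Date.now()) {
  // Cryptographically secure random integer (the range must stay below 2^48).
  const randomPart = crypto.randomInt(0, 2 ** 48 - 1);
  return crypto
    .createHash("sha256")
    .update(`${userId}${now}${randomPart}`)
    .digest("hex"); // 64 hexadecimal characters
}
```

The random part is what makes the ID hard to guess. The User ID and timestamp alone would be predictable.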
Then, we initialize an instance of the DynamoDB client.
And finally, we create the Lambda function handler whose input will be the user info such as the User ID and the output will be the session info which includes the Session ID, the expiry date, the active status, and the time it was created.
In the previous section, we discussed how we have to provide the data type of the value of the arguments. But notice how we’re not doing that here. Instead, we create a regular JavaScript object and pass that to the marshall function from the @aws-sdk/util-dynamodb package.
The marshall function takes a regular JavaScript object as input, interprets the data type of each argument, and returns an object with the format expected by DynamoDB where the datatype of the value of an attribute is provided.
For example, if we provide the input { sessionId: "abcd-abcd-abcd" } to the marshall function, we get the output as { sessionId: { S: "abcd-abcd-abcd" } }.
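To get a feel for what marshall does, here is a deliberately simplified stand-in that only handles strings, numbers, and booleans. It is not the library’s implementation (the real marshall function supports many more types), and the names toAttributeValue and simpleMarshall are made up for this illustration.

```javascript
// Wraps a single JavaScript value in DynamoDB's typed format.
function toAttributeValue(value) {
  if (typeof value === "string") return { S: value };
  if (typeof value === "number") return { N: String(value) }; // numbers travel as strings
  if (typeof value === "boolean") return { BOOL: value };
  throw new TypeError(`Unsupported type in this sketch: ${typeof value}`);
}

// Converts every property of a flat object to its typed form.
function simpleMarshall(object) {
  return Object.fromEntries(
    Object.entries(object).map(([key, value]) => [key, toAttributeValue(value)])
  );
}
```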
We call the .putItem() method on the instance of DynamoDB with the table name and the attributes of the item as input to perform the PutItem command.
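The overall shape of the handler can be sketched as follows. To keep the example runnable without AWS access, the DynamoDB call is injected as a putItem function, which in the real function would be the client’s .putItem() method receiving the marshalled item. The makeCreateSessionHandler name, the attribute names createdAt and expiresAt, the 14-day initial lifetime, and the use of random bytes for the ID are all assumptions of this sketch.

```javascript
const crypto = require("crypto");

const SESSION_LIFETIME_MS = 14 * 24 * 60 * 60 * 1000; // 14 days (assumed)

// `putItem` is injected so the sketch can run with a fake in place of DynamoDB.
function makeCreateSessionHandler(putItem) {
  return async function lambdaHandler(event) {
    const now = Date.now();
    const session = {
      sessionId: crypto.randomBytes(32).toString("hex"), // stand-in for generateId
      userId: event.userId,
      isActive: true,
      createdAt: now,
      expiresAt: now + SESSION_LIFETIME_MS,
    };
    // In the real function, Item would be marshall(session).
    await putItem({ TableName: "UserSessions", Item: session });
    return session;
  };
}
```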
Validate Session and Get Session Data
Similar to the create-session function, we first initialize an instance of DynamoDB. The input to the Lambda function in this case will be an object with a single sessionId property and the output will be the session information.
We perform the GetItem operation by running the .getItem() method with the table name and key of the item we want to access as input, which in this case is the Session ID.
The structure of the response of the GetItem operation is similar to the input we provide to the PutItem operation, i.e. the datatypes of the attribute values are provided. However, we don’t want to deal with that as it makes accessing the data cumbersome. To remove the datatypes from the object we use the unmarshall utility, whose function is the exact opposite of the marshall function. For example, if we provide the object { sessionId: { S: "abcd-abcd-abcd" } } as input, the output would be { sessionId: "abcd-abcd-abcd" }.
Before returning the session info we check if the session is expired. If it is, we perform an UpdateItem operation by running the .updateItem() method to set the isActive attribute to false, and return the session info object. If it is not expired, we update the expiry date of the session to 14 days from the current time and return the session info object with the updated expiry date.
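The decision logic described above can be sketched as a small pure function, leaving the database calls out. The validateSession name and the expiresAt attribute (a millisecond timestamp) are assumptions, as is the first check, which leaves an already deactivated session untouched.

```javascript
const SESSION_LIFETIME_MS = 14 * 24 * 60 * 60 * 1000; // 14 days

// Returns the session as it should look after validation at time `now`.
function validateSession(session, now = Date.now()) {
  // Assumption: a deactivated session is returned as-is and never revived.
  if (!session.isActive) return session;
  if (session.expiresAt <= now) {
    return { ...session, isActive: false }; // expired: mark it inactive
  }
  return { ...session, expiresAt: now + SESSION_LIFETIME_MS }; // valid: extend it
}
```

In the Lambda function, whichever attribute changed would then be written back with an UpdateItem call before the session info is returned.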
Deactivate Session
You might ask why we’re deactivating a session and not deleting it. Imagine after a session is created the user logs out, but before that, the user stores the session token that was created elsewhere.
Then, let’s consider we delete the token instead of deactivating it when the user logs out. Sometime in the future, however slim the possibility there might be, imagine a token is created that matches the exact token we had previously created and deleted for the previous user.
The previous user still has access to that token and if they try to access the application using that session token, they will get complete access to the second user's account.
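To prevent the above scenario, we make sure the token will be unique till the end of time (literally). The simplest way to do this is to keep every token in the table with an active state that records whether it is still active. Since deactivated tokens are never deleted, a newly generated token can be checked against all previous ones, including the deactivated ones, and the primary key on sessionId means two items can never share a token. Keep in mind that PutItem overwrites an existing item by default, so the write needs a condition (for example attribute_not_exists(sessionId)) for a collision to be rejected instead of silently replacing the old session.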
Similar to the validate and get session data function, the input to this function will also be an object with the Session ID as a property. We then run the UpdateItem operation and set the isActive attribute to false and return the updated session info.
We also set the ReturnValues attribute of the UpdateItem command to ALL_NEW, which tells DynamoDB to return all the attributes after updating the item.
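A sketch of the deactivate handler might look like this. As before, the DynamoDB call is injected as an updateItem function so the example runs without AWS access, and makeDeactivateSessionHandler is a hypothetical name. The ConditionExpression is an addition of this sketch: UpdateItem creates a new item when the key does not exist, so the condition keeps an unknown Session ID from producing an empty session record.

```javascript
// `updateItem` is injected; in the real function it would be the client's .updateItem() method.
function makeDeactivateSessionHandler(updateItem) {
  return async function lambdaHandler(event) {
    const result = await updateItem({
      TableName: "UserSessions",
      Key: { sessionId: { S: event.sessionId } },
      UpdateExpression: "SET isActive = :activeStatus",
      ExpressionAttributeValues: { ":activeStatus": { BOOL: false } },
      // Assumption: only update sessions that already exist.
      ConditionExpression: "attribute_exists(sessionId)",
      // Ask DynamoDB to return every attribute as it looks after the update.
      ReturnValues: "ALL_NEW",
    });
    // Still in DynamoDB's typed format; unmarshall before returning it to callers.
    return result.Attributes;
  };
}
```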
Testing Locally and Deploying with SAM
The Serverless Application Model (SAM) is a framework that helps us build, test, package, and deploy serverless applications. In our case, we’ll be using it to test and deploy the Lambda functions we just created.
How SAM Works
To test Lambda functions locally, SAM creates an execution environment using Docker and executes the function based on the SAM template. To package and deploy the functions, SAM uses S3 and AWS CloudFormation under the hood.
SAM Template
The SAM template is a YAML file that gives SAM the information it requires, such as each function’s runtime and where the code for our functions is located.
At the top of the template, we provide the description of the application and some basic information that is required by CloudFormation which we won’t have to worry about.
Under the Resources section, we have the three functions we just created which are all of type AWS::Serverless::Function, i.e. a Lambda function.
We then provide the following properties to each function under Properties:
- CodeUri: The relative path of the directory which contains the code
- Handler: The Lambda handler function, which in our case is index.lambdaHandler, i.e. the lambdaHandler function which we exported in index.js.
- Runtime: The execution environment of the Lambda function, which will be nodejs12.x.
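Based on the description above, a template along these lines would fit. Treat it as a sketch: the Description text is made up, DeactivateSessionFunction is an assumed resource name (the other two names are the ones used when invoking the functions), and the CodeUri values assume the template sits next to the three folders.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: Session authentication with Lambda and DynamoDB

Resources:
  CreateUserSessionFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: create-session/
      Handler: index.lambdaHandler
      Runtime: nodejs12.x

  GetSessionInfoFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: get-session-info/
      Handler: index.lambdaHandler
      Runtime: nodejs12.x

  # Assumed name; use whatever name your own template gives this function.
  DeactivateSessionFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: deactivate-session/
      Handler: index.lambdaHandler
      Runtime: nodejs12.x
```

AWS retires older runtimes over time, so a newer Node.js runtime identifier may be needed in place of nodejs12.x.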
Setup AWS Credentials
To authenticate with AWS, we’ll need our access key ID and the secret access key. You can get them by clicking on My Security Credentials under your username in the AWS Dashboard. In the Your Security Credentials page, under the Access Keys section click Create New Access Key. Make sure to note down or download the secret access key as it won’t be visible again.
You can set up your AWS credentials using the AWS CLI, a credentials file, or environment variables. To set your credentials using environment variables, run the below commands in your terminal.
$ export AWS_ACCESS_KEY_ID=your_access_key_id
$ export AWS_SECRET_ACCESS_KEY=your_secret_access_key
Invoking the Functions
To start a dev server which will be the endpoint for calling our Lambda functions, run sam local start-lambda. By default, this command will start a server at localhost:3001.
Let’s say you have a /login endpoint. Once you verify the credentials the user entered are correct, you would invoke the CreateUserSessionFunction with the user info as input and then store the sessionId from the output in the user's cookies. This way, every time a user sends a request to your backend, you can get the sessionId from the cookies and call GetSessionInfoFunction with that sessionId as input and get the session info.
To invoke the function using the AWS JavaScript SDK for Node, you can use the code below. It invokes the GetSessionInfoFunction as mentioned in the SAM template file.
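What follows is a sketch of such an invocation, not the original snippet. The two helpers are hypothetical names and run on their own. The commented usage assumes the @aws-sdk/client-lambda package is installed, that sam local start-lambda is running on port 3001, and that us-east-1 is an acceptable placeholder region.

```javascript
// Builds the input for invoking GetSessionInfoFunction with a session ID.
function buildInvokeInput(sessionId) {
  return {
    FunctionName: "GetSessionInfoFunction",
    // The payload is sent as bytes containing the JSON-encoded event.
    Payload: Buffer.from(JSON.stringify({ sessionId })),
  };
}

// Decodes the byte payload returned by an invocation back into an object.
function parseInvokePayload(payload) {
  return JSON.parse(Buffer.from(payload).toString("utf8"));
}

// Usage with the AWS SDK (inside an async function):
//   const { Lambda } = require("@aws-sdk/client-lambda");
//   const lambda = new Lambda({ region: "us-east-1", endpoint: "http://127.0.0.1:3001" });
//   const response = await lambda.invoke(buildInvokeInput(sessionId));
//   const sessionInfo = parseInvokePayload(response.Payload);
```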
Deploying the Functions
Finally, to package the functions run sam build, and to deploy them to AWS, run sam deploy. It’s really that easy!
To invoke the functions in deployment, remove the endpoint from the above example and you should be good to go!
Conclusion
There are numerous ways we can go about session authentication and as mentioned before, using the approach we just used with Lambda and DynamoDB is not for every use case, especially not for monolithic applications.
The great thing about our approach, and about serverless in general, is that there are no servers to provision or scale, so it can go to production from day one. And with much less effort, our approach can be just as secure as some of the more secure methods of using JWTs.
What Next
- You can try using an in-memory database like Redis for the best performance.
- Use a .env file or a credentials file to store your AWS credentials so that you don’t have to set the environment variables every time.
The final code for the tutorial can be found on GitHub at shreyas44/session-auth-tutorial