Making bedav.org an Offline First Progressive Web App

Shreyas Sreenivas
Published in The Startup
9 min read · Sep 14, 2020

Ever since I wrote my first post on the tech I used to build bedav.org, I’ve expanded to Pune, added an overview for each city, created a page that lists all locations along with their overviews (total, available and occupied beds in those cities/districts), and turned the site into an offline-first Progressive Web App.

Here, I’ll be talking about how I turned the website into a Progressive Web App (PWA) while focusing on an offline-first experience.

But first, what is an offline first experience and a Progressive Web App?

Below is a talk from Google I/O 2016 by Jake Archibald, in which he turns an existing web app into an offline-first PWA and compares the improvement in performance at every step of the way.

But if you do not want to watch the 45-minute video, here are the short definitions.

Progressive Web Apps (PWA)

Progressive Web Apps, in simple terms, are web applications that provide speed, features and reliability similar or equal to that of native apps. A web application that qualifies as a PWA can be installed and used as a standalone app with a native-app-like experience.

PWAs allow for easier access to your application and a much faster, more responsive interface, and in most cases let users access your application even when they are offline.

Check out this amazing article to learn more about PWAs.

Offline first

An offline-first website is a website that works offline (I think that was obvious lol). But it’s a lot more than just that: the website is made available offline by caching all of its assets and data, and the result is a uniform, fast user experience regardless of whether the user is online, offline or on a bad internet connection.

To read more about the offline-first architecture, check out this article.

To turn bedav.org into a PWA with an offline-first experience, the following things had to be done:

  1. Reduce the initial render time of the page
  2. Cache the assets that render the page
  3. Cache the queried data

Reducing the initial render times

The initial render time was around 5 seconds. It was that high because every single hospital was being rendered during the initial render.

The most common solution is pagination with infinite scroll. But that would mean sending a network request to the server every time the user reaches the bottom of the page. For many applications this is not a problem; here, however, the data for all the hospitals was only about 25 KB, so sending numerous network requests to fetch a kilobyte or so at a time was just too inefficient. A better solution was to fetch all the data at once and render it incrementally as the user scrolls down. This decreased initial render times to only 1–1.5 seconds!
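The render-as-you-scroll idea can be sketched as a small pure helper that decides how many rows to mount; the names and the 200px threshold are illustrative, not the actual bedav.org code:

```javascript
// Decide how many hospital rows to mount, given the scroll position. All
// data is already in memory; only the DOM is "paginated".
function nextRenderCount(current, total, { scrollTop, clientHeight, scrollHeight }, batch = 20) {
  const nearBottom = scrollTop + clientHeight >= scrollHeight - 200;
  return nearBottom ? Math.min(current + batch, total) : current;
}
```

A scroll listener calls this on every event and re-renders only when the count actually grows, so mid-page scrolling costs nothing.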

Caching Assets

To cache assets there were two options: regular HTTP caching, or service workers along with the caches API.

Service workers with the caches API were the obvious choice here: the cache is reliable, predictable, persists until you delete it and can be accessed even when the user is offline. For a detailed comparison and explanation of the two, check out this article on web.dev.

What are service workers, though?
In short, service workers are scripts that run in the background, independent of your main website. They enable a lot of things that were previously only possible in native apps, such as push notifications, intercepting network requests, caching assets and background sync. Here’s a great article explaining service workers and walking through their basic usage.
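Intercepting network requests is the piece that makes offline work: a fetch handler can answer from the cache before the network is ever touched. Here is a minimal cache-first sketch (not the actual bedav.org service worker); note that `caches.match` resolves to `undefined` on a cache miss, which is what the fallback relies on:

```javascript
// A cache-first strategy in miniature: serve a cached response when there
// is one, otherwise fall back to the network.
function cacheFirst(cached, fetchFromNetwork) {
  return cached || fetchFromNetwork();
}

// Wire it up only in a real service worker context (browser-only APIs).
if (typeof self !== 'undefined' && typeof caches !== 'undefined') {
  self.addEventListener('fetch', (event) => {
    event.respondWith(
      caches.match(event.request).then((cached) => cacheFirst(cached, () => fetch(event.request)))
    );
  });
}
```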

The most common practice is to cache the assets required for the landing page up front and, when the user navigates to another page, only then cache the assets required for that page (a dynamic cache).

But in the case of bedav.org, there were only four different views/pages: the landing page showing all the locations, the locality view showing the hospitals in a given location, the hospital view showing all the information about a hospital, and the about page. In total, the assets required to render these pages came to only about 2 MB, so I decided to cache all of them when the user first visits the website, during the install event of the service worker. Even the fonts and icons used on the page are cached when the service worker is installed.

All the JavaScript bundles are named using a content hash, which means even the slightest change to the code changes the bundle name. Since I decided to cache all the assets required by the website during the install event, the service worker needed the names of the bundles generated by webpack. To get those names into the service worker, I used serviceworker-webpack-plugin.
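A sketch of the install-time precache looks roughly like this. The page and font URLs and the cache name are illustrative; serviceworker-webpack-plugin injects the hashed webpack bundle names into the worker as the global `serviceWorkerOption.assets`:

```javascript
// sw.js — precache everything on install: static pages, fonts/icons, and
// every hashed webpack bundle.
function buildPrecacheList(hashedBundles) {
  const staticAssets = ['/', '/about', '/fonts/app-icons.woff2']; // illustrative
  return staticAssets.concat(hashedBundles);
}

// Wire up the event only in a real service worker context.
if (typeof self !== 'undefined' && typeof caches !== 'undefined') {
  self.addEventListener('install', (event) => {
    event.waitUntil(
      caches
        .open('bedav-cache')
        .then((cache) => cache.addAll(buildPrecacheList(serviceWorkerOption.assets)))
    );
  });
}
```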

Every time the user visits the website, the browser checks whether the service worker has changed and, if it has, installs the new version. Since the bundle names are injected into the service worker by serviceworker-webpack-plugin, the contents of the compiled service worker change every time I modify my code. Hence, the latest version of the website is always available to the user.

Versioning the cache was one more thing that had to be done, because we don’t want older assets that will never be used again taking up space on users’ devices. There were many options: we could version the cache based on the date, or limit the size of the cache. But in either case there’s a chance of keeping assets around longer than necessary. Since the main bundle’s name includes a content hash, I decided to use that content hash as the name of the cache for that particular version.
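The cleanup side of this versioning scheme can be sketched in the service worker’s activate event (the hash value below is illustrative): anything not named after the current bundle hash is stale and gets deleted.

```javascript
// The live cache is named after the main bundle's content hash.
const CURRENT_CACHE = 'bedav-3f2a91'; // illustrative value

// Everything except the current cache is stale.
function cachesToDelete(allCacheNames, currentCacheName) {
  return allCacheNames.filter((name) => name !== currentCacheName);
}

// Wire up the event only in a real service worker context.
if (typeof self !== 'undefined' && typeof caches !== 'undefined') {
  self.addEventListener('activate', (event) => {
    event.waitUntil(
      caches.keys().then((names) =>
        Promise.all(cachesToDelete(names, CURRENT_CACHE).map((name) => caches.delete(name)))
      )
    );
  });
}
```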

Caching the Data

For caching to work with GraphQL APIs, we have to query for exactly the same data: all the variable values and fields requested have to be the same. If even one variable/argument differs slightly, we cannot serve the result from the cache, as the response would most likely be different.

If the user sorted by distance and searched for ‘s’, the query would look something like:

query {
  hospitals(lat: 1.12334, lon: 1.12345, orderBy: DISTANCE, descending: true, searchQuery: "s") {
    name
    distance
    ...otherFields
  }
}

Now, the moment the user types another character into the search box, another request is sent with a brand-new query. If the user decides to sort by available ICU beds instead, yet another request is sent. We do not want to cache every single query we send. Notice, too, that we pass the latitude and longitude as query arguments and the distance is calculated on the server side. This means that if the user moves even slightly and the coordinates change by a tiny amount, the query changes and we cannot retrieve the result from the cache.

Localizing Operations

To solve these problems, the number of arguments had to be reduced or eliminated entirely. To do that, we would have to fetch all the hospital data at once, calculate the distance locally, and perform all operations related to sorting, filtering and searching locally as well.

But if we localized these calculations and operations, wouldn’t that slow down the website?
Based on testing, no, it doesn’t. Since there are at most 300–400 hospitals per location, there isn’t much data to process. In fact, after localizing these operations, the results of searching, filtering and sorting appeared instantaneously (literally)! Now, even with a bad internet connection or no connection at all, users can still search, filter and sort through the hospitals without giving up any speed.
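The localized operations can be sketched as plain functions over the in-memory list (field names are illustrative, not the real schema): the distance is computed client-side from the cached coordinates with the haversine formula, and search and sort run over the result.

```javascript
// Great-circle distance between two coordinates, in kilometres.
function haversineKm(lat1, lon1, lat2, lon2) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 6371 * 2 * Math.asin(Math.sqrt(a)); // Earth radius ≈ 6371 km
}

// Filter by name, attach a locally computed distance, sort nearest-first.
function searchAndSortByDistance(hospitals, user, searchQuery) {
  return hospitals
    .filter((h) => h.name.toLowerCase().includes(searchQuery.toLowerCase()))
    .map((h) => ({ ...h, distance: haversineKm(user.lat, user.lon, h.latitude, h.longitude) }))
    .sort((a, b) => a.distance - b.distance);
}
```

Because nothing here touches the network, the same code runs identically online and offline.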

To do this, I created a separate context whose value is the list of hospitals to display, in order, and the HospitalList component takes care of rendering them (using the pagination strategy mentioned in the section on reducing initial render times).

Now, querying for all the hospitals looks something like this:

query {
  hospitals {
    name
    # the latitude and longitude of the hospital, used to calculate the distance
    latitude
    longitude
    ...otherFields
  }
}

Now the query remains the same even when the user moves around, searches for a hospital, or filters and sorts through the results.

Switching from Relay to Apollo

The next step was to enable caching. At the time, I was using Relay as my GraphQL client, and implementing a good, efficient caching strategy with it became a pain. Here are just some of the downsides of Relay, in my limited experience with it:

  • Community is pretty small and support is lacking
  • Overly complicated concepts
  • Documentation is subpar

You can cache queries via Relay, but the cache persists only in memory and only for a certain period. I was looking for something that could persist the cache in local storage and reuse it the next time the user visits the page, no matter when that is. Persisting it in local storage would also mean that the user would be able to use the data offline.

For the reasons above, I decided to make the switch to Apollo. The first thing I did was open the Apollo documentation, and you could already see the difference: it is so much better in every way!

One immediate advantage I saw was that navigating between pages was a lot faster than before. For example, if you go to the Bangalore page, then to the Pune page, and then back to the Bangalore page, no request is sent to the server: Apollo serves the results from its cache. And this works with the default setup!

Apollo’s caching techniques are much smarter than Relay’s.

The InMemoryCache normalizes query response objects before it saves them to its internal data store. Normalization involves the following steps:

  1. The cache generates a unique ID for every identifiable object included in the response.
  2. The cache stores the objects by ID in a flat lookup table.
  3. Whenever an incoming object is stored with the same ID as an existing object, the fields of those objects are merged:
     • If the incoming object and the existing object share any fields, the incoming object overwrites the cached values for those fields.
     • Fields that appear in only the existing object or only the incoming object are preserved.

To learn more about caching in Apollo, check out their documentation here.
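As a toy model of the normalization steps above (not Apollo’s actual code): identifiable objects are keyed by `__typename:id` in a flat lookup table, and an incoming object with the same ID is merged field by field.

```javascript
// Store an incoming response object in a flat, normalized table.
function normalize(store, incoming) {
  const key = `${incoming.__typename}:${incoming.id}`;
  // Shared fields are overwritten by the incoming object; fields present
  // in only one of the two objects are preserved.
  store[key] = { ...(store[key] || {}), ...incoming };
  return store;
}
```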

The entire application was migrated from Relay to Apollo within a few hours! In the default setup, Apollo caches query results in memory. This is a problem because that cache only exists while the user is on the page; once the user exits, the cache is cleared. To persist the cache in local storage, I used apollo-cache-persist. All I had to do was add 3 lines of code, and my cache now persisted in local storage!
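The setup can be sketched roughly like this, assuming Apollo Client 3 (the endpoint is illustrative, and for Apollo Client 3 the persistence package lives on as apollo3-cache-persist):

```javascript
import { ApolloClient, InMemoryCache } from '@apollo/client';
import { persistCache } from 'apollo-cache-persist';

async function createClient() {
  const cache = new InMemoryCache();
  // Restore any previously persisted cache from local storage before the
  // first query runs, then keep the two in sync from here on.
  await persistCache({ cache, storage: window.localStorage });
  return new ApolloClient({ uri: '/graphql', cache });
}
```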

But now the cache would never be updated, because the data is always available in the cache. To fix this, all I had to do was set the fetch policy to cache-and-network: data is always read from the cache if it’s available, while the cache itself is updated in the background with fresh data from the network.
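In component code, that fetch policy looks something like this (the hook and query module are hypothetical, not the actual bedav.org code):

```javascript
import { useQuery } from '@apollo/client';
import { HOSPITALS_QUERY } from './queries'; // hypothetical query module

function useHospitals() {
  // Return cached data immediately when available, and refetch from the
  // network in the background to keep the cache fresh.
  return useQuery(HOSPITALS_QUERY, { fetchPolicy: 'cache-and-network' });
}
```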

Optimizing the Cache

Lastly, there was one final thing. This wasn’t a problem so much as something that could be improved. When you go to a hospital’s page, all of its information is refetched, but most of it is already in the cache from when the user visited a location’s page. The only extra information needed for the hospital page was the phone number, website and address. So I added these fields to the query that fetches all the hospitals on a location’s page, and used client.readFragment to read the data from the cache when it’s available.
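Hypothetically, reading a hospital back out of the normalized cache looks something like this (the fragment and field names are illustrative, not the real schema; `client` and `hospitalId` come from the surrounding component):

```javascript
import { gql } from '@apollo/client';

const hospital = client.readFragment({
  // By default, Apollo keys normalized objects as `__typename:id`.
  id: `Hospital:${hospitalId}`,
  fragment: gql`
    fragment HospitalContactDetails on Hospital {
      name
      phone
      website
      address
    }
  `,
});
// `hospital` is null if any of the fragment's fields are missing from the
// cache, in which case the page falls back to a normal query.
```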

Results of these changes

  1. Initial render times decreased from around 5 seconds to only about a second.
  2. Searching, filtering and sorting are now literally instantaneous.
  3. The website can be used even when the user is offline.
  4. The speed and responsiveness of the website is pretty much independent of the users’ internet connection, i.e. it works equally well when the user is online, offline or has a bad internet connection (assuming they’ve visited the website before).
  5. Users can now install the app and add it to their home screen to access it like a native app.
  6. Lastly, it reduced the load on our server and database (this was never the goal but a welcome side effect!).

All in all, these changes significantly improved the performance of the website, improved the user experience and made it more accessible, especially to those living in remote areas.

Note: None of these queries are the actual queries used on the website; they serve only as examples.

Shreyas Sreenivas
Programmer. Writer. TypeScript, GraphQL and React Lover. Oh and also a GSW fan