Storing data with the file system

Published October 28, 2018 by Toran Billups

In part 5 of the URL shortener adventure, we persist our data to the file system.

Part 5: The file system

In the previous post we added a cache process that would rehydrate the workers' state between restarts. The next problem with the URL shortener is that restarting the IEx session loses all application state, because that state is only stored in memory.

The first pass at solving this problem is to store all application state on the file system. Start by modifying the cache's startup to make a new directory. The files we put in this directory will be used to store the cache process state. I'm using the `handle_continue` callback now because `File.mkdir_p!` could potentially block the process that is starting the cache. With `{:continue, :init}`, `init` returns right away and the directory is created in `handle_continue`, before any other message is handled.

    defmodule EX.Cache do
      use GenServer

      @database "./database"

      @impl GenServer
      def init(:ok) do
        # Return immediately; the directory is created in handle_continue/2.
        {:ok, %{}, {:continue, :init}}
      end

      @impl GenServer
      def handle_continue(:init, state) do
        # Runs right after init/1, before any other message is processed.
        File.mkdir_p!(@database)
        {:noreply, state}
      end
    end
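
To make the ordering guarantee concrete, here is a minimal sketch that is separate from the url shortener code. `ContinueDemo` and its `:ready` flag are hypothetical names used only for illustration, and `Process.sleep/1` stands in for slow setup work such as creating a directory.

```elixir
defmodule ContinueDemo do
  use GenServer

  # Hypothetical module for illustration; it is not part of the EX application.
  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, :ok)

  @impl GenServer
  def init(:ok) do
    # Return right away so the process calling start_link/1 is not held up.
    {:ok, %{ready: false}, {:continue, :setup}}
  end

  @impl GenServer
  def handle_continue(:setup, state) do
    # Stand-in for slow setup work such as File.mkdir_p!/1.
    Process.sleep(50)
    {:noreply, %{state | ready: true}}
  end

  @impl GenServer
  def handle_call(:ready, _from, state), do: {:reply, state.ready, state}
end

{:ok, pid} = ContinueDemo.start_link()

# This call is queued behind handle_continue/2, so setup has finished
# by the time the reply is sent.
ready = GenServer.call(pid, :ready)
IO.inspect(ready)
```

The point of the sketch is that callers never observe the half-initialized state, even though `start_link` returned before the setup work ran.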
  

To hydrate process state from the file system, we first fetch the list of files in the `./database` directory we designated above. Next we transform that list of file names into a list of tuples, each holding a file name and that file's contents. Finally we turn the list of tuples into a `Map` and return it so `EX.Worker` can operate as it did before this refactor.

    defmodule EX.Cache do
      use GenServer
    
      @database "./database"
    
      @impl GenServer
      def handle_call({:all}, _from, _state) do
        # Rebuild the state from disk: one {key, value} pair per file.
        state =
          list_files()
          |> Enum.map(&tuple_from_filesystem/1)
          |> Enum.into(%{})

        {:reply, state, state}
      end
    
      defp list_files() do
        # :file.list_dir/1 returns file names as charlists, so keys are charlists too.
        case :file.list_dir(@database) do
          {:ok, files} -> files
          {:error, _} -> []
        end
      end
    
      defp tuple_from_filesystem(key) do
        {:ok, contents} = File.read(file_name(@database, key))
        {key, :erlang.binary_to_term(contents)}
      end
    
      defp file_name(database, key) do
        Path.join(database, to_string(key))
      end
    end
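
One detail worth seeing in isolation is what comes back from the directory listing and from `:erlang.binary_to_term/1`. The sketch below is only an illustration: it uses a hypothetical scratch directory under the system temp dir rather than `./database`, and it writes the charlist with the `~c` sigil, which is equivalent to the single-quoted form.

```elixir
# Hypothetical scratch directory for this demo only; not the post's ./database.
dir = Path.join(System.tmp_dir!(), "list_dir_demo_#{System.unique_integer([:positive])}")
File.mkdir_p!(dir)

# One file per key, with the value serialized by :erlang.term_to_binary/1.
File.write!(Path.join(dir, "x"), :erlang.term_to_binary(~c"google.com"))

# Erlang's :file.list_dir/1 returns names as charlists ...
{:ok, erlang_names} = :file.list_dir(dir)
# ... while Elixir's File.ls/1 returns them as binaries (strings).
{:ok, elixir_names} = File.ls(dir)

# binary_to_term/1 restores the exact term that was written, charlist and all.
value = dir |> Path.join("x") |> File.read!() |> :erlang.binary_to_term()

IO.inspect({erlang_names, elixir_names, value})
File.rm_rf!(dir)
```

Because the listing returns charlists, the hydrated map ends up keyed by charlists, which is why `file_name/2` runs each key through `to_string/1` before joining the path.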
  

Finally, we enhance the insert functionality to persist process state to the file system. We use `File.write!` to create a new file in the `./database` directory, using the `key` as the file name and the `value` as the file contents.

    defmodule EX.Cache do
      use GenServer
    
      @database "./database"
    
      @impl GenServer
      def handle_cast({:put, key, value}, state) do
        @database
        |> file_name(key)
        |> File.write!(:erlang.term_to_binary(value))

        {:noreply, Map.put(state, key, value)}
      end
    
      defp file_name(database, key) do
        Path.join(database, to_string(key))
      end
    end
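
To see the write path and the hydration path working together, here is a rough, self-contained sketch of the same idea. `DemoStore` is a hypothetical stand-in, not the post's `EX.Cache`: it takes the directory as an argument, hydrates in `init`, and uses a scratch directory under the system temp dir so it can be run without touching `./database`.

```elixir
defmodule DemoStore do
  use GenServer

  # Hypothetical module for illustration; the real cache is EX.Cache.
  def start_link(dir), do: GenServer.start_link(__MODULE__, dir)
  def put(pid, key, value), do: GenServer.cast(pid, {:put, key, value})
  def all(pid), do: GenServer.call(pid, {:all})

  @impl GenServer
  def init(dir) do
    File.mkdir_p!(dir)
    {:ok, %{dir: dir, data: hydrate(dir)}}
  end

  @impl GenServer
  def handle_cast({:put, key, value}, %{dir: dir, data: data} = state) do
    # Persist first, then update the in-memory copy.
    File.write!(Path.join(dir, to_string(key)), :erlang.term_to_binary(value))
    {:noreply, %{state | data: Map.put(data, key, value)}}
  end

  @impl GenServer
  def handle_call({:all}, _from, state), do: {:reply, state.data, state}

  # Read every file back into a map of key => value.
  defp hydrate(dir) do
    {:ok, files} = :file.list_dir(dir)

    Map.new(files, fn key ->
      {key, dir |> Path.join(to_string(key)) |> File.read!() |> :erlang.binary_to_term()}
    end)
  end
end

dir = Path.join(System.tmp_dir!(), "demo_store_#{System.unique_integer([:positive])}")

{:ok, first} = DemoStore.start_link(dir)
DemoStore.put(first, ~c"x", ~c"google.com")
# Messages from one sender are handled in order, so the cast has been
# processed by the time this call returns.
_ = DemoStore.all(first)
:ok = GenServer.stop(first)

# A brand new process sees the entry because it reads it back from disk.
{:ok, second} = DemoStore.start_link(dir)
restored = DemoStore.all(second)
IO.inspect(restored)

:ok = GenServer.stop(second)
File.rm_rf!(dir)
```

Stopping the first process and starting a second one plays the role of restarting IEx: nothing is shared between the two except the files on disk.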
  

Now if you play around with this module in IEx you can restart without losing any state!

    EX.Worker.get(:worker, 'x')
    EX.Worker.put(:worker, 'x', 'google.com')
    EX.Worker.put(:worker, 'y', 'amazon.com')
    EX.Worker.put(:worker, 'z', 'ebay.com')
    EX.Worker.get(:worker, 'x')
  

To see whether the cache process was hydrated correctly on startup, check out the `:sys.get_state` function.

    pid = GenServer.whereis({:via, Registry, {EX.Registry, {:cache}}})
    :sys.get_state(pid)
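
If you want to try the lookup without the full application running, the following sketch shows the same two calls against a throwaway process. `DemoRegistry` and `DemoCache` are hypothetical names introduced for this example; only the `{:via, Registry, ...}` shape mirrors the snippet above.

```elixir
defmodule DemoCache do
  use GenServer

  # Hypothetical module; registers itself under a :via tuple like the real cache.
  def via, do: {:via, Registry, {DemoRegistry, {:cache}}}

  def start_link(initial) do
    GenServer.start_link(__MODULE__, initial, name: via())
  end

  @impl GenServer
  def init(initial), do: {:ok, initial}
end

{:ok, _registry} = Registry.start_link(keys: :unique, name: DemoRegistry)
{:ok, started_pid} = DemoCache.start_link(%{~c"x" => ~c"google.com"})

# Resolve the :via tuple to a pid, then ask the process for its current state.
pid = GenServer.whereis(DemoCache.via())
state = :sys.get_state(pid)
IO.inspect(state)
```

`:sys.get_state/1` is meant for debugging, so it is a good fit for poking around in IEx but not something to call from application code.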
  

You can track my progress on GitHub commit by commit. If you just want the code for this post, check out this commit.

