Golang has become a language of the web. Its simplicity for building servers and web applications makes it an ideal candidate for new projects. Here at PMG, we started our first Golang application in mid-2017 and have enjoyed working with it. The application is running well and development is straightforward.
However, when the project first started, and with prior attempts having been unsuccessful, there was uncertainty about how to structure and build the application. Which packages should we use, and how do we arrange our own? What persistence mechanisms should we use? How do we represent our domain types and business logic in the code?
Before We Get Started
In this post, I hope to clarify some ways to structure a Golang application and how everything can fit together. The following is a single example application, developed for the purpose of this post, that draws on the lessons learned by our entire development team. As with all software, there are many different ways to accomplish the same thing.
A couple of things this is:
- A small, fully working example Golang application.
- Some general development tips and tricks our team has picked up along the way.
Some things this is not:
- The be-all-end-all way of doing things.
- A production-grade application.
- An example of managed infrastructure or development operations.
- An example of a secure API. Securing the API and authenticating requests is considered outside the scope of this application.
All code can be found at https://github.com/AgencyPMG/go-from-scratch/tree/20170717.1. Installation and running instructions can be found in the README.
Go from Scratch
A few things up front:
- We use Glide in this example to manage our external Golang dependencies.
- We use DBSchema to manage our database schema.
- We use Docker Compose to manage our backing services. Currently, there is just a PostgreSQL database. Using Docker Compose is a great way to avoid installing development stacks on every dev computer; installing Docker and spinning up the containers will do the job just fine. The containers must be running before the application runs.
- Configuration must be set in the environment before the application runs.
High-Level Repository Structure
- The app/internal directory is what holds all Golang code.
  - The app/internal/data package is the root directory for all of our domain types.
    - The app/internal/data/user package holds the User type.
    - The app/internal/data/client package holds the Client type.
  - The app/internal/cli directory holds all of our executables. I like having them all in one directory to make build scripts and locating them easier. In this case there is just one. In other applications, this could grow to many more, since there could be an executable for different services within the application.
  - The app/internal/gfsweb package holds the types and logic used in the gfsweb executable. We will see the connection between this package and the cli/gfsweb package later.
- The app/schema directory holds the schema definitions for our relational database.
- The bin directory holds utility scripts. On more advanced projects, this could include things like deployment, infrastructure, and Docker scripts.
- The etc directory holds our simple configuration. It could be extended to hold more complicated or application specific configuration.
- The var directory is used as a data directory for runtime artifacts. In this case, the only such artifacts are our Docker Compose volumes, which are stored there.
Exploring the app/internal/data Packages
If we were to start at the main entry point of the application, we would look at app/internal/cli/gfsweb/main.go, but that doesn’t tell us much about what the application does. To understand more of what the application does, we need to look at the data package – this is where we find our domain types.
Inside the data directory (package) we house all of our domain types. The main reason is that this structure keeps all of these types under a single roof instead of mixed in with the rest of our library and logic packages. Additionally, the name data signifies that these are the data types the application cares about. Another reasonable name would be domain.
Inside the data package we have the Id type. The Id type provides us a universal identifier to use for all of our data entities in the application. Having a dedicated type for an identifier signals more intent than an integer or string identifier whose name alone indicates its use.
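If you’re curious what that might look like, here is a minimal sketch; the string underpinning and the NewId helper are assumptions on my part, not the repository’s definition.

```go
package data

// Note: the UUID import is an assumption; any UUID library would work here.
import "github.com/google/uuid"

// Id is a universal identifier used by every data entity in the application.
// This sketch assumes a string-backed type.
type Id string

// NewId is a hypothetical helper that generates a fresh identifier.
func NewId() Id {
	return Id(uuid.New().String())
}
```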
One level inside the data package is the user package. In here we find the User type. Obviously, this represents a User of the application. We can see that the User has an Id field of type data.Id. There is also an Email field – we make this a requirement for a user. Finally, there are some metadata time.Time fields that we use to track the created and updated times of the entity and an Enabled flag. This enabled flag isn’t really used in the app, but is a good example of extra information a user could hold. More possible fields could be a password hash, first and last names, addresses, etc.
Also in the user package we find our first QueryRepo and Repo interfaces:
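(What follows is a sketch reconstructed from the points below; the exact method sets and signatures are assumptions, not copied from the repository.)

```go
// QueryRepo retrieves Users. Its methods return pointers: the repo creates
// a new instance and hands ownership to the caller.
type QueryRepo interface {
	Get(id data.Id) (*User, error)
	GetAll() ([]*User, error)
}

// Repo modifies Users. It embeds QueryRepo because modifying an entity
// usually means retrieving it first. Its methods accept struct values so
// the repo cannot modify the caller's variables.
type Repo interface {
	QueryRepo

	Add(u User) error
	Set(u User) error
}
```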
I want to make some points here about the interfaces.
- There are two interfaces for interacting with Users. One is called QueryRepo and is used to retrieve Users; the other is Repo and is used for modifying Users. We include QueryRepo inside Repo because most of the time we need to modify an entity, we also need to retrieve it first. We will see this when we look at our command handlers. These two interfaces also point toward, and make it easier to implement, command query responsibility segregation.
- The methods in QueryRepo return pointer types. We are assuming that the repo creates a new instance of the type, and once it is returned, the calling code is the owner of the variable. Thus the calling code is free to examine and modify what it gets back.
- On the other hand, the methods in Repo accept struct parameters, not pointer values. This keeps the repositories from modifying the calling variables while they do their work. In essence, these methods should read and store what they are called with.
Now we can look at the data/client package. You can see it is very similar to the user package. It has the Client struct type and repository interfaces.
Implementing Repo
We just saw the QueryRepo and Repo interfaces for our entity types. Now we need to implement them in order to actually store and retrieve those variables. Regardless of the storage mechanism we choose for the entities – SQL databases, in memory, a document store, or files on disk – those implementations and their packages live inside the entity’s package. For instance, there could be a clientmem or clientmongo package that understands and talks with its respective storage backend.
In our example, we use Postgres as the storage mechanism and the database/sql package to interface with it from our code. The main focus of this application is not how the entities get stored, but rather how everything fits together. With that said, the usersql and clientsql packages do not use an ORM. These packages use simple queries to fetch and update data and rely only upon the sql package and a driver-specific package. This code changes rarely and needs modification only when the schema or entities change. Finally, we include the app/internal/data/storage/sqlrepo package to help with common, boilerplate SQL code.
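To make that concrete, here is a sketch of what a usersql fetch method could look like with plain database/sql; the table and column names are assumptions, not the repository’s schema.

```go
// Get fetches a single User by id with a plain query, translating
// sql.ErrNoRows into the application's sentinel error (discussed below).
func (r *Repo) Get(id data.Id) (*user.User, error) {
	u := &user.User{}
	err := r.db.QueryRow(
		`SELECT id, email, enabled, created_at, updated_at
		 FROM users WHERE id = $1`,
		id,
	).Scan(&u.Id, &u.Email, &u.Enabled, &u.CreatedAt, &u.UpdatedAt)
	if err == sql.ErrNoRows {
		return nil, data.ErrEntityDoesNotExist
	}
	if err != nil {
		return nil, err
	}
	return u, nil
}
```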
Using the Repo Interfaces
Once we have the repositories set up and working, or at least defined, we can go ahead and use them in our command handlers and business logic. We are going to do this in the data/*/*cmd packages. These packages are named as such because they handle commands to alter the state of the application in some way. These are commands in the sense of CQRS.
As an example, let’s look at part of the clientcmd package.
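The following sketch reconstructs the shape of that code from the description below; the field names, the command struct, and the handler signature are assumptions.

```go
// Handler holds everything needed to process the client commands it
// knows about.
type Handler struct {
	Clients client.Repo
}

// CreateClientCommand is a command in the CQRS sense: a plain struct
// describing a requested state change.
type CreateClientCommand struct {
	Name string
}

// CreateClient is registered with the command bus for
// *CreateClientCommand, which is why the type assertion below is safe.
func (h *Handler) CreateClient(ctx context.Context, cmd cbus.Command) (interface{}, error) {
	createCmd := cmd.(*CreateClientCommand)

	c := client.Client{
		Id:        data.NewId(), // hypothetical helper from the data package
		Name:      createCmd.Name,
		CreatedAt: time.Now(),
		UpdatedAt: time.Now(),
	}
	return &c, h.Clients.Add(c)
}
```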
Above we see a Handler type that has fields for everything it needs to process all of the commands it knows about. There is also a cbus package; we use the github.com/gogolfing/cbus package for this. In essence, Command types (empty interfaces that can be anything) are registered with handler functions so that when a Command goes through the bus, the bus knows what code to execute. That registration is somewhere else in the application, but it is why we can type assert cmd.(*CreateClientCommand) without checking the type first.
All the command handler does, in this case, is create a new client.Client (with all fields set) and call Add on h’s client repository. This is the place for business logic to occur while modifying the state of the application. This is the code that we want reused from many sources to perform creation tasks. In this application, all commands are issued from the API package (which we will examine next), but we could easily extend the app to have a CLI interface that accepts a new client name and creates a new Client in the exact same way our HTTP API does.
Final Thoughts on the Data Packages
This is a stripped-down version of the data packages we use in our applications. For instance, our production applications have many more entity types and relationships, each with cmd and sql sub-packages.
One idea that I want to present is that of specific error types for cases that could be hit in a normal application. Two that we can point to in this example are 1) data.ErrEntityDoesNotExist and 2) user.ErrClientWithIdDoesNotExist. These can be simple sentinel error values or more complicated error types with more functionality if need be. The point, though, is to be able to pass these error values/types around the application. Since so much code relies on the data packages – for instance, the API packages – that code can examine errors to figure out what went wrong.
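A sentinel error can be as simple as a package-level variable; this sketch assumes the standard errors package and an invented message.

```go
package data

import "errors"

// ErrEntityDoesNotExist signals that a requested entity is not in storage.
var ErrEntityDoesNotExist = errors.New("entity does not exist")
```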
For instance, the API package could send 404 status codes if a command results in a data.ErrEntityDoesNotExist error because it knows the request is attempting to modify an entity that doesn’t exist. Or, in the second case, respond with specific error messages or codes if a caller attempts to update a user with a client that isn’t in the application.
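On the API side, that check might look something like this sketch (a pre-Go 1.13 sentinel comparison, matching the era of this post):

```go
// Hypothetical API-side error translation.
if err == data.ErrEntityDoesNotExist {
	w.WriteHeader(http.StatusNotFound)
	return
}
```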
The API Package
This package is how external clients get things done within our application. By making requests through the API, they are able to access and modify entities within the application. So how does it work?
The package is located at app/internal/gfsweb/handler/api. Inside it we have the API type that has fields for everything the API needs to do its job. Note that there is a cbus.Bus – this is used to execute commands that change the state of the application – and query repository interfaces, which are used to fetch application state. There are no full repositories in the API because we want the command bus and its business logic implementations to handle changes to the application.
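Pieced together from that description, the API type might look roughly like this; the field names are assumptions.

```go
// API holds everything the HTTP layer needs: a command bus for writes
// and query repositories for reads.
type API struct {
	Bus     *cbus.Bus
	Users   user.QueryRepo
	Clients client.QueryRepo
}
```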
The single exported method on API is the Handler method, which returns an http.Handler. This Handler should be used by an http.Server to serve all requests. In this method we have a mux implementation – here we use github.com/gorilla/mux, but most implementations will work. The mux allows for complex routing with different paths, methods, and route parameters to determine a single handler to call per request.
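A sketch of that method, assuming gorilla/mux and routes matching the curl examples later in this post; the unexported method names are invented.

```go
// Handler wires routes to handler methods.
func (a *API) Handler() http.Handler {
	r := mux.NewRouter()

	r.HandleFunc("/users", a.getUsers).Methods("GET")
	r.HandleFunc("/users", a.createUser).Methods("POST")
	r.HandleFunc("/users/{id}", a.updateUser).Methods("PATCH")

	r.HandleFunc("/clients", a.createClient).Methods("POST")
	r.HandleFunc("/clients/{id}", a.getClient).Methods("GET")

	return r
}
```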
Now let’s look at two Handler functions for two different endpoints in the API.
The getClient method is called for GET requests when a client id is supplied in the request URL. All it’s doing (with some help) is retrieving the Client specified by the route id and sending it in the response.
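Something like the following sketch, where the Id parsing and response helpers are invented for illustration:

```go
func (a *API) getClient(w http.ResponseWriter, r *http.Request) {
	id, err := data.ParseId(mux.Vars(r)["id"]) // hypothetical Id parser
	if err != nil {
		a.sendError(w, http.StatusBadRequest, err) // hypothetical helper
		return
	}

	c, err := a.Clients.Get(id)
	if err != nil {
		a.sendError(w, http.StatusNotFound, err)
		return
	}
	a.sendResponse(w, http.StatusOK, c) // marshals via the dto package
}
```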
The createClient method is called for POST requests on the client collection request URL. All it’s doing (with some other help) is parsing a CreateClient form, creating a command from the form, executing the command, and finally sending the response.
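Again a sketch; the form type, the cbus execution method name, and the helpers are assumptions:

```go
func (a *API) createClient(w http.ResponseWriter, r *http.Request) {
	form := &dto.CreateClient{} // hypothetical form type
	if err := json.NewDecoder(r.Body).Decode(form); err != nil {
		a.sendError(w, http.StatusBadRequest, err)
		return
	}

	// The exact cbus execution method name is an assumption.
	result, err := a.Bus.ExecuteContext(r.Context(), &clientcmd.CreateClientCommand{
		Name: form.Name,
	})
	if err != nil {
		a.sendError(w, http.StatusInternalServerError, err)
		return
	}
	a.sendResponse(w, http.StatusCreated, result)
}
```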
Those two explanations are intentionally short and simple. We want the handlers to be “dumb” and free of logic. They are straightforward in that they receive input and write output. Obviously, there are helper methods that get used quite frequently, but this is done to keep the handlers as simple as possible. There is no knowledge of storage in the API, just that there is a QueryRepo. There is no business logic in the API – knowing about a command bus is all it needs.
Final Thoughts on the API Package
Just like this example’s data packages, the api package is stripped down. There isn’t much error processing; what we get from the repositories and command bus is what we send in the response. In a more robust application, error processing would include sending different error codes and messages depending on the errors that occur. See the final thoughts for the data packages above.
Also, there is a whole other sub-package, dto. Short for data transfer object, that package is in charge of unmarshaling inputs and marshaling outputs. This responsibility is left to the API package because 1) it is at the “boundary” of the application and 2) we don’t want the external representation of the domain types to be tightly coupled with their internal representations. I.e., having the user.User type carry json tags would drag the JSON-marshaled representation of the type around with it. Marshaling in the api package makes the external representation much more flexible. In this case, adding XML support for the API happens in the dto package – no changes to the data package are necessary for external representation.
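For example, a dto output type for users might look like this sketch, with the marshaling tags kept out of the domain package (field names assumed):

```go
package dto

// User is the external (JSON) representation of user.User.
type User struct {
	Id      string `json:"id"`
	Email   string `json:"email"`
	Enabled bool   `json:"enabled"`
}
```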
The gfsweb Package
With these separate packages, we lastly need to look at how all of them come together so that each part gets what it needs and the application becomes fully functional. For that we have the gfsweb package. There are two main components in gfsweb – the App type and the AppBuilder type. The App type is somewhat simple compared to the builder, so we will take a look at the AppBuilder.
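A rough sketch of how that wiring might read; the constructor and field names here are invented for illustration, not the repository’s code.

```go
// Build is a sketch of the bootstrap wiring.
func (b *AppBuilder) Build() (*App, error) {
	// The database comes first since everything else depends on it.
	db, err := sql.Open("postgres", b.DBAddr) // b.DBAddr is assumed
	if err != nil {
		return nil, err
	}

	// Concrete repository implementations are chosen here and nowhere else.
	users := usersql.New(db)     // hypothetical constructors
	clients := clientsql.New(db)

	bus := buildBus(users, clients) // registers command handlers with cbus

	a := &api.API{Bus: bus, Users: users, Clients: clients}
	return &App{Handler: a.Handler(), Closer: db}, nil
}
```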
We can see that this is the location where all the pieces get created and sent where they need to go. Since the database is a dependency of the repositories that are themselves dependencies of the command bus and the API, we create it first. Everything else follows from there.
We should note that this is where concrete implementations of all of our interfaces are built. In this case, the repositories are built from the data/*/*sql packages. If, for instance, you decided to use Redis for some of your domain storage, then this is where you would create a Redis-backed repository and send that interface throughout the application.
Finally, the App type needs 1) a server, 2) a Handler for that server, and 3) something to close the database connection. If there were more open connections, then we’d place those on the App as well so that the App can close them when it is told to shut down. This App instance is created and controlled by the app/internal/cli/gfsweb/cli package. That package is the one that understands it is running a command line application and needs to bootstrap itself and run an App.
What I like about this package is that it changes rarely, or at least in a way that is very clearly defined. There may be additional repositories, additional storage backends that need connecting to, or more commands required, but once this package is solidified and working, the bulk of the functionality comes from the data and api packages even though this package knows about them and bootstraps them.
Running the Application
This section assumes that you have followed the installation and build instructions in the repository’s README. It also assumes there is no data in the database before running the following.
Start by running ./gfsweb in a terminal.
In another terminal, run
curl localhost:8080/users
and you should get an empty JSON array indicating there are no users yet.
Next, create a User via
curl localhost:8080/users -X POST --data '{"email": "email1@example.com", "enabled": true }'
The result of this command is the api/dto
JSON representation of the User that is created.
Next, we can create a Client via
curl localhost:8080/clients -X POST --data '{"name": "Client 1"}'
Then run
curl localhost:8080/users/<id_of_user_above> -X PATCH --data '{"client_ids": ["<id_of_client_above>"]}'
Notice that because of the way our API, forms, and commands are built we don’t have to specify the entirety of the entity to update it, just what we want to change.
Try running curl localhost:8080/users/1234
You should see an error indicating an invalid Id value. Although error processing and messages leave something to be desired in this example, they are at least present and easy to build on top of.
There are other commands to run and things to play with, but we leave that as an exercise for the reader.
Conclusion
Hopefully, this has been educational in some fashion. Yes, it is a small example, but it is functional and complete. Feel free to reach out with questions or comments about things left out, or with recommendations for improvement. Each of the sections in this post, and the sections left out, could become a post of its own.