How to use domain modeling to improve subject tagging
An introductory guide to tagging: part 4
In the previous post I talked about the challenges of tagging. Here I will start to look at some approaches that can help.
One technique that I will always come back to is domain-driven design (DDD). The principles behind domain-driven design go back to Eric Evans and his book Domain-Driven Design: Tackling Complexity in the Heart of Software. The starting point for domain-driven design has similarities to domain analysis in Library Science.
In their work on domain analysis, Albrechtsen and Hjørland make the case for giving domain knowledge a more important role in Information Science.
Many approaches to information science and Knowledge Organisation may be understood as attempts to pass over subject knowledge (or at least not make subject knowledge explicit in their methodologies). Domain analysis, on the other hand, makes subject knowledge an explicit and important part of the methodologies of information science and knowledge organization.
Towards a new horizon in Information Science
DDD, though aimed at software developers, comes to a similar conclusion: many software projects fail because they pass over subject knowledge and do not make understanding the domain a central part of the process. The problem is that everyone on a team goes away and builds according to their own private view of the domain. Then, when it comes to putting the pieces of the design together, they don't fit.
DDD ensures that a team spends time together at the outset of a project to align their understanding, before they design or build anything. The approach succeeds because it supplements existing expertise with a strong, shared knowledge of the domain.
The aligning of language during these sessions is crucial. Participants talk to each other, draw for each other and do everything they can to understand where differences of understanding or terminology are lurking.
Diagrams and Word documents have a habit of hiding our differences rather than surfacing them. True shared understanding can only really be achieved by talking about the domain from as many angles as possible.
It is as simple as that.
When starting on a project to design and build anything, I would suggest the following principles:
- Recognise that caring about and understanding the subject domain is important.
- Domain experts are a critical part of the team.
- Everyone brings different views, preconceptions and simplifications to a domain, and these show up in the language they use.
- Learning where these differences are will be expensive if learnt through writing code, building a taxonomy or designing a user interface. Better to learn this cheaply in front of a whiteboard.
- The best way to surface these differences and align language is in talking, a lot, together as a team.
- Simple domain model diagrams are a great way to share and challenge understanding of the domain.
The main outcome of all this is a shared language, used in conversations and in design decisions. There will likely also be some domain model diagrams. An example is the domain model of a sports website from the classic Polar Bear presentation by Mike Atherton.
For example, we might decide we want to capture tags for teams and matches. Why not stadiums? If we tag with a match, we already get the stadium from the model, so we can reduce the burden on taggers to what is absolutely necessary. We also get the competition round from the model. Each tag is part of a larger model, and we adjust our tagging strategy based on our understanding of that model.
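As a minimal sketch of that idea, the Python below models a match that already knows its teams, stadium and competition round. All of the class, field and function names, and the sample data, are hypothetical and chosen for illustration; they are not taken from the presentation or from any real system. It shows that a single match tag can be expanded into the other subjects the model implies.

```python
from dataclasses import dataclass


# Frozen dataclasses are hashable, so instances can be collected in a set.
@dataclass(frozen=True)
class Team:
    name: str


@dataclass(frozen=True)
class Stadium:
    name: str


@dataclass(frozen=True)
class CompetitionRound:
    competition: str
    name: str


@dataclass(frozen=True)
class Match:
    home: Team
    away: Team
    stadium: Stadium
    competition_round: CompetitionRound


def derived_subjects(tags):
    """Expand the tags a person applied into everything the model implies."""
    subjects = set(tags)
    for tag in tags:
        if isinstance(tag, Match):
            # A match already carries its teams, stadium and round,
            # so the tagger never has to add them by hand.
            subjects.update(
                {tag.home, tag.away, tag.stadium, tag.competition_round}
            )
    return subjects


# Hypothetical sample data.
united = Team("United")
city = Team("City")
ground = Stadium("Example Park")
final = CompetitionRound("Example Cup", "Final")
match = Match(united, city, ground, final)

# The tagger applies one tag; the model supplies the rest.
subjects = derived_subjects([match])
```

Here the tagger's only decision is which match the content is about. Whether the derived subjects are stored alongside the tag or looked up when needed is a separate design choice that this sketch leaves open.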
Why not tag with players? Analysis may show that there is little value in including them. If, for example, we have seen little use of player pages in other parts of the site, then there is no justification for tagging them. Some might argue that we need to pre-empt any possible future use of the system, so we should tag players even though this adds no value now. In my experience this is where we get into trouble. Keeping a tight focus on the value, and on the tagging needed to create that value, is crucial. The domain model helps us map out possibilities, but an understanding of our goals helps us pick the right path through that map.
To continue this example, if we identify that tagging with teams and matches will support the objectives of the site, then we have an idea of what good looks like. This informs how we develop the tools, vocabularies and guidelines to support this goal. An understanding of the domain, the website's goals and its evolving scope has defined how we design our tagging.
That is a simple example, but there is a whole range of more complex case studies to support this, from Wellcome, the BBC and Parliament.
I often get asked whether this goes against user-centred design philosophies. Yes and no.
When working at its best, DDD is complementary to user-centred approaches and works hand in hand with them. At its core is the idea that the domain is a key organising principle. This can still sit within a user-first approach. However, on the whole I feel that a team aligning its understanding and language within a domain before engaging with users can only be a good thing.
In summary, one part of the solution to better tagging is to design with a collective understanding of your system's domain.
Part 1: Why the web turned its back on the librarians. And why we need them back.
Part 2: The importance of tagging when publishing on the web
Part 3: Why good subject tagging is hard