A Reading List about Code Health

In this post I would like to give an overview of the books on code health that had the most influence on me as a software engineer. They are also listed in my post A Reading List for Software Engineers (which is still in an unfinished state), but are covered here in more detail.

The books are not presented in any particular order, except for the first entry (which is actually a series of books). That series had a profound effect on me: it made the whole field of code health and sustainable software development accessible to me.

But before we dive into the books, let me say a few words about why I believe code health is such an important topic: it is essential for sustainable development at a high pace, and a determining factor in the long-term success of projects.

Why Code Health?

Code health is very difficult to define and even more difficult to measure. We have some indicators that can show the presence of potential issues, but to my knowledge there is no indicator that proves a codebase is healthy. In other words, it is easier to spot issues than to spot quality. In my opinion, this is because code health is to a large extent defined by the higher-level structures and abstractions of a system and not so much by local, low-level properties. This leaves us in a complicated situation and makes it quite difficult to approach the topic from a scientific point of view.

Nevertheless, we have gathered plenty of anecdotal evidence over the last decades, and I have collected my own experiences during my professional career. In a high-quality codebase I am able to deliver a great deal of value per unit of time, while maintaining the high quality. In contrast, in a big ball of mud, I feel I am not even worth the money I am being paid. And the difference in productivity I am talking about here is not a factor of 2 or 3; it is orders of magnitude. And yes, the plural orders is not a typo. We are talking about factors above 100 here.

So why do we so often work in low-quality codebases? This is particularly puzzling, since fresh projects start out with high quality by default. I have written about this in another post (Code Health is not a Trade-off) and do not want to go into too much detail here, but I think it is a combination of multiple factors, such as a poor understanding of the requirements and the problems, inexperienced developers, time pressure and companies setting the wrong incentives.

My personal solution to all of this is to apply the knowledge I extracted from the books listed below (and many other books which are not directly related to code health, but still had a great impact) and to shape the parts of the codebases I work with in such a way that they allow me (and other developers) to perform at the highest level of productivity. This is not always easy, and too often external factors further complicate it. But starting with the books listed in this post is a first step toward making it happen.

The Clean Code Series

This series consists of four books. In chronological order, these are Clean Code: A Handbook of Agile Software Craftsmanship, The Clean Coder: A Code of Conduct for Professional Programmers, Clean Architecture: A Craftsman’s Guide to Software Structure and Design and Clean Agile: Back to Basics. The first and third books are all about code health, while the second is more about software craftsmanship and being a professional software engineer, and the fourth gives a sort of historical perspective on the Agile movement. I list them all here for completeness: the second book is also relevant to code health, the fourth one perhaps less so. But overall, I think they are all worth a read.

In any case, right after I made the transition from academia into the world of professional software development, I started reading Clean Code, The Clean Coder and Clean Architecture (if I recall correctly, in that particular order). Those books had a tremendous impact on me and my view of writing software. Reading them made me realize that the software I wrote during my time in research was of pretty low quality. I was always very confident about the code I produced, but as a scientist I did not know how much better I could have done. Looking back, it frightens me a bit. I am still sure that my results were correct, and there is a lot of empirical evidence to back this up, but how could I have been so unprofessional? Well, my main focus was on the scientific results; writing software was just one of the tools I used to obtain those results. And I used that tool quite badly.

Coming back to the books. Clean Code: A Handbook of Agile Software Craftsmanship focuses on what I would now call low-level readability: smaller chunks of code being expressive and readable. In some sense, this is the foundation on which we can build larger components and whole systems that all adhere to high standards of quality. And that is exactly the topic of Clean Architecture: A Craftsman’s Guide to Software Structure and Design. Here the focus is on higher-level structures and abstractions. The principles explained in this book also apply to the design and architecture of whole systems or services, but the focus is more on components and the relations between them: exactly those structures which to a large extent determine overall code health. This should not diminish the value of low-level readability, but while I personally value low-level readability, I value clean structures and abstractions even more.
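To make the idea of low-level readability a bit more concrete, here is a tiny toy example of my own (not taken from the book): the same check written twice, once with magic values and a cryptic name, and once with intention-revealing names.

```python
# A toy illustration of low-level readability (my example, not the book's).

# Hard to read: a cryptic name, magic numbers and opaque tuple indices.
def chk(u):
    return u[1] >= 18 and u[2]

# Easier to read: descriptive names make the intent obvious at a glance.
ADULT_AGE = 18

def is_eligible_voter(name, age, is_registered):
    return age >= ADULT_AGE and is_registered

print(is_eligible_voter("Ada", 42, True))   # True
print(is_eligible_voter("Bob", 12, True))   # False
```

Both functions compute the same thing; the difference is entirely in how much effort the reader has to spend.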

The Clean Coder: A Code of Conduct for Professional Programmers complements the other two books by providing guidelines for professional behavior as a software engineer. It definitely does not have the same strong focus on code health, but it lays out why professional conduct is important for doing our job efficiently (and for being able to push for code health and other things we as developers know are important). For Clean Agile: Back to Basics, the focus is even less on code health. That said, for those who understand the Agile movement as a strong force for promoting code health and clean software development practices, there is quite a connection. In any case, I would recommend this book for the historical perspective it provides on the Agile movement and its more recent developments.

All in all, Clean Code, Clean Architecture and The Clean Coder are all must-reads, and I would recommend reading them in exactly that order. Clean Agile is a nice addition, but it would not be wrong to put it into your personal book backlog for the moment.

Expanding Your Toolkit

The books in this section are Design Patterns: Elements of Reusable Object-Oriented Software, Test Driven Development: By Example, Working Effectively with Legacy Code and Refactoring: Improving the Design of Existing Code. Let me start by saying that they are all must-reads. Each of them enriches your personal toolkit as a professional software engineer with important insights and generic techniques. I am pretty sure I use some of the knowledge from these books on a day-to-day basis.

Let’s start with Design Patterns: Elements of Reusable Object-Oriented Software. During my time as a university student, a friend introduced me to this book and I read it front to back in only a few days. I did not fully understand all of its aspects, but I found it fascinating to think in patterns while developing software. Unfortunately, I was not able to grasp the full value of this book at the time. That only happened years later, after I had made the transition from academia into software engineering and read the book a second time. I don’t know what changed, but it was much easier to comprehend and suddenly all the patterns were easily accessible. Maybe the little experience I had gathered writing production software played a role here. In any case, this book contains a catalog of generic design patterns and presents them in a very structured way. These patterns make it easier to write, communicate and reason about software. They are mostly about lower-level design, just where software engineering gets interesting and where decisions have a lot of impact on code health.
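To give a flavor of what thinking in patterns looks like, here is a minimal sketch of one pattern from the book, Strategy; the class and method names are my own, chosen purely for illustration.

```python
# A minimal sketch of the Strategy pattern (names are mine, for illustration):
# an algorithm is encapsulated behind an interface, so it can be swapped
# without changing the code that uses it.
import zlib
from abc import ABC, abstractmethod

class CompressionStrategy(ABC):
    @abstractmethod
    def compress(self, data: bytes) -> bytes: ...

class ZlibCompression(CompressionStrategy):
    def compress(self, data: bytes) -> bytes:
        return zlib.compress(data)

class NoCompression(CompressionStrategy):
    def compress(self, data: bytes) -> bytes:
        return data

class FileArchiver:
    # The archiver is configured with a strategy instead of hard-coding one.
    def __init__(self, strategy: CompressionStrategy):
        self._strategy = strategy

    def archive(self, data: bytes) -> bytes:
        return self._strategy.compress(data)

archiver = FileArchiver(NoCompression())
print(archiver.archive(b"hello"))  # b'hello'
```

Having a shared vocabulary for such structures is, in my view, at least as valuable as the structures themselves.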

The next one is Test Driven Development: By Example. This book is all about writing tests first in order to drive the implementation of production code, and it had a profound effect on me. For a period of around two months, I practiced test-driven development (TDD). It felt slow and awkward at the beginning, but I got used to it quite fast. The insights I gathered from that time concern what makes code and design testable, and how this can be achieved without sacrificing any other code or design properties. Think about it for a moment; this is huge: making code testable without being invasive. I stopped strict TDD after those two months and currently practice something I would call pseudo-TDD (I guess I will have to write about that in the future). By the way, this is quite typical of how I read books: I try to absorb the knowledge within them and use it to improve, complement and adapt my personal practices and views as a software engineer, while I usually do not fully buy into the more dogmatic parts. That does not mean there is anything wrong with strict TDD; quite the contrary, I can only recommend that everyone read this book and practice TDD (at least for a while). For me personally, trying it for a while was sufficient to extract the underlying principles and incorporate them into my personal practices. So saying that my current style is influenced by TDD would definitely be an understatement.
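To illustrate the rhythm the book teaches, here is a small sketch; the example (FizzBuzz) is mine, not from the book. In practice the test is written first and fails (red), then the simplest passing implementation follows (green), and then one refactors.

```python
# A sketch of the TDD rhythm with a toy example of my own.
# Run the tests with: python -m unittest thisfile.py
import unittest

# Step 2 (green): the simplest implementation that makes the tests pass.
def fizzbuzz(n: int) -> str:
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

# Step 1 (red; written first in practice): the tests that drive the code above.
class FizzBuzzTest(unittest.TestCase):
    def test_multiples_of_three(self):
        self.assertEqual(fizzbuzz(3), "Fizz")

    def test_multiples_of_five(self):
        self.assertEqual(fizzbuzz(5), "Buzz")

    def test_multiples_of_both(self):
        self.assertEqual(fizzbuzz(15), "FizzBuzz")

    def test_other_numbers(self):
        self.assertEqual(fizzbuzz(7), "7")
```

The interesting part is not the trivial example, but the habit: because the test exists before the code, the code is testable by construction.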

The last two books in this section, Working Effectively with Legacy Code and Refactoring: Improving the Design of Existing Code, are related to each other. Both focus on non-functional changes, so-called refactorings, but in different contexts. The first focuses on how we can regain control over legacy code. It contains a large number of specific techniques that can be used to incrementally transform legacy code into a healthier state. If you have ever despaired while trying to add tests to existing code, break up a large method or separate the responsibilities of a single huge class, this is the right book for you. The second book approaches refactoring from a different angle. It popularized the term code smell: a suspicious structure in the code that could cause issues in the long term. Many such code smells are presented in the book, together with the techniques that can be used to remove them. In my opinion, the two books complement each other nicely, and both should be part of any professional software engineer’s library.
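As a small taste of the second book's approach, here is a toy example of my own showing the duplicated code smell and an Extract Function refactoring that removes it. The tax-calculation scenario and all names are hypothetical.

```python
# A toy example (mine) of the "duplicated code" smell and its removal.

# Before: the tax computation is duplicated in two places, so a change
# to the tax rule would have to be made twice.
def net_price_before(price):
    return price + price * 0.19

def net_total_before(prices):
    return sum(p + p * 0.19 for p in prices)

# After: the shared rule lives in one intention-revealing function.
TAX_RATE = 0.19

def with_tax(price):
    return price + price * TAX_RATE

def net_price(price):
    return with_tax(price)

def net_total(prices):
    return sum(with_tax(p) for p in prices)
```

The behavior is unchanged; the refactoring only removes the duplication, which is exactly what makes refactorings safe to apply incrementally.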


More Perspectives

In this section I would like to introduce a few books which I read after those discussed earlier. Given the knowledge I already had when reading them, it is difficult for me to judge how much value they would provide on their own. All I can say is that they were very good reads: they refreshed my memory on many aspects of code health, provided different perspectives on certain topics and offered new insights which made me adapt and improve my personal practices even further. As for the latter, it is not clear how many of those insights were actually completely new and how many I had simply missed while reading other books. This is also why I tend to re-read books years later, with more profound background knowledge and in a completely different context. In any case, the following books were definitely worth my time: The Art of Readable Code: Simple and Practical Techniques for Writing Better Code, A Philosophy of Software Design and Understanding Software: Max Kanat-Alexander on simplicity, coding, and how to suck less as a programmer.

I would recommend all of these to anyone who wants to dig deeper into the topic of code health and who has already consumed most of the other books mentioned earlier.


I hope this list provides some value for fellow software engineers, in particular those at the very beginning of their careers. If so, please feel free to share it with your peers. And please let me know in the comments what you think about this list and whether anything is missing.

A Reading List for Software Engineers

IMPORTANT: This post is not yet in its final desired form. The ratings (which probably need a finer granularity) and the categories still need some work, and a lot of descriptions are missing.

In this post I would like to assemble a list of resources that I have found helpful in my work as a software engineer. Since my views slowly change over time, I will try to keep this post up to date and continuously add, remove and adapt content. To make this reading list easier to consume, I will put the books into different categories, potentially placing a book in multiple categories at the same time. I will also give each book one or more ratings out of essential, useful, optional and classic.

As usual, a short disclaimer: Everything I present in this post (and in this blog in general) is my personal view and does not represent the view of my current employer, my former employers or any future employers.


Code Health

Refactoring: Improving the Design of Existing Code
[Rating: essential]

Design Patterns: Elements of Reusable Object-Oriented Software
[Rating: essential]

Working Effectively with Legacy Code
[Rating: essential]

Test Driven Development: By Example
[Rating: essential]

Clean Code: A Handbook of Agile Software Craftsmanship
[Rating: essential]

Clean Architecture: A Craftsman’s Guide to Software Structure and Design
[Rating: essential]

The Art of Readable Code: Simple and Practical Techniques for Writing Better Code
[Rating: useful]

A Philosophy of Software Design
[Rating: useful]

Understanding Software: Max Kanat-Alexander on simplicity, coding, and how to suck less as a programmer
[Rating: essential]

Algorithms and Data Structures

Introduction to Algorithms
[Rating: essential]

This book is essential for building a solid foundation regarding algorithms, data structures and to some extent also computational complexity.

I first stumbled upon an earlier version of this book during my studies in computer science and really enjoyed reading it. The book itself is quite verbose, covers a large amount of topics and contains a lot of illustrations. Overall, it is a very good introductory book, as well as a valuable reference later on.

On top of that, the book can be very useful for interview preparation.

[Rating: useful]

Not quite as good as Introduction to Algorithms for my taste, but still a very good book.

The focus is more on practical aspects of algorithms, while Introduction to Algorithms covers the practical as well as the theoretical side. For this reason it might actually be the preferred choice for some engineers.

This book can also be very useful for interview preparation.

The Art of Computer Programming
[Rating: classic]

These books are definitely among the most famous books about computer programming of all time. Having started the project in 1962, Donald Knuth published the first three volumes between 1968 and 1973. Volume 4A followed in 2011, with further parts of Volume 4 appearing as fascicles since then. Additional volumes are planned and might conclude this compilation at some point in the future.

I personally bought volumes 1-4 at the beginning of 2017. So far I have read only a few parts. But since I added these books to my personal library, my productivity has miraculously increased by 2-3%.

Not an easy read, though.

Programming Languages


Effective Modern C++: 42 Specific Ways to Improve Your Use of C++11 and C++14
[Rating: essential]

Effective C++: 55 Specific Ways to Improve Your Programs and Designs
[Rating: useful]

The C++ Standard Library: A Tutorial and Reference
[Rating: useful]

Modern C++ Programming with Test-Driven Development: Code Better, Sleep Better
[Rating: useful]


Fluent Python: Clear, Concise, and Effective Programming
[Rating: essential]

Effective Python: 59 Specific Ways to Write Better Python
[Rating: useful]


Java: The Complete Reference
[Rating: useful]


The Go Programming Language
[Rating: essential]


Practical Vim: Edit Text at the Speed of Thought
[Rating: essential]

The Linux Command Line: A Complete Introduction
[Rating: useful]

Docker Deep Dive
[Rating: useful]

Shell Scripting: How to Automate Command Line Tasks Using Bash Scripting and Shell Programming
[Rating: useful]

Pro Git
[Rating: useful]

Mercurial: The Definitive Guide
[Rating: useful]

Software Craftsmanship

Coders at Work: Reflections on the Craft of Programming
[Rating: useful]

97 Things Every Programmer Should Know: Collective Wisdom from the Experts
[Rating: useful]

97 Things Every Software Architect Should Know: Collective Wisdom from the Experts
[Rating: useful]

The Passionate Programmer: Creating a Remarkable Career in Software Development
[Rating: useful]

Effective Programming: More Than Writing Code by Jeff Atwood
[Rating: useful]

How to Stop Sucking and Be Awesome Instead
[Rating: useful]

A Programmer’s Rantings: On Programming-Language Religions, Code Philosophies, Google Work Culture, and Other Stuff
[Rating: useful]

Software Developer Life: Career, Learning, Coding, Daily Life, Stories
[Rating: useful]

The Pragmatic Programmer: your journey to mastery
[Rating: useful]


Clean Agile: Back to Basics
[Rating: useful]

The Clean Coder: A Code of Conduct for Professional Programmers
[Rating: useful]

Becoming a Technical Leader: An Organic Problem-Solving Approach
[Rating: useful]

Debugging Teams: Better Productivity through Collaboration
[Rating: useful]

The Dip: A Little Book That Teaches You When to Quit (and When to Stick)
[Rating: useful]

So Good They Can’t Ignore You: Why Skills Trump Passion in the Quest for Work You Love
[Rating: useful]

Crucial Conversations Tools for Talking When Stakes Are High, Second Edition
[Rating: useful]

Crucial Accountability: Tools for Resolving Violated Expectations, Broken Commitments, and Bad Behavior
[Rating: useful]

The Healthy Programmer: Get Fit, Feel Better, and Keep Coding (Pragmatic Programmers)
[Rating: useful]

The 7 Habits of Highly Effective People: Powerful Lessons in Personal Change
[Rating: useful]

Getting Things Done: The Art of Stress-Free Productivity
[Rating: useful]

Great at Work: The Hidden Habits of Top Performers
[Rating: useful]


Code Complete: A Practical Handbook of Software Construction, Second Edition
[Rating: classic]

The Mythical Man-Month: Essays on Software Engineering
[Rating: classic]

Code Health is not a Trade-off

It is a popular perception in software engineering that there exists a trade-off between code health and delivering software fast. This is a huge misconception, fueled by the fact that we very often maneuver ourselves into situations where we suddenly have to decide between improving the quality of our codebase and delivering new features.

Yes, this trade-off is real, and I have experienced it myself more often than I would have liked. But the underlying problem is that we allow this trade-off to materialize in the first place, without actually getting anything in return.

In the remainder of this post I will look into some of the reasons that create such trade-off situations, how we can avoid them and what we can do to get out of them.

Before going more into detail, a short disclaimer: Everything I present in this post (and in this blog in general) is my personal view and does not represent the view of my current employer, my former employers or any future employers.

Accumulating Technical Debt

Having a large amount of technical debt is the reason we experience this trade-off between code health and delivering features fast. So why do we get into this vortex of technical debt? There are three reasons.

First, we very often incorrectly perceive that by going into technical debt we gain short-term advantages which outweigh all the long-term downsides. In software development, this is just not the case. Alright, there are a few notable exceptions, like the one-off script, the small prototype or the feasibility study. But even in those cases: how often did we need our one-off script more than once? How often did our small prototype grow over time until it made it into production? How often did our feasibility study require time-consuming debugging until it gave correct results? I do not want to claim that those exceptions do not exist, but they are extremely rare. In all other cases, there are no significant short-term benefits from going into technical debt. Technical debt hits us much faster than we expect, even in single-person short-term projects, and all the more in multi-developer projects with a long lifetime.

Second, we sometimes simply do not know how to do better. Each developer has a specific set of skills and experiences, and even when doing our best, we might still introduce additional technical debt. This is one of the reasons why I personally do not like the term technical debt. In some cases (the first point) we go into technical debt consciously, though based on false premises; mostly (this point and the next one) we go into it accidentally. And that is not how you typically get into other forms of debt.

And third, even in cases where we do know how to implement things perfectly, we get it wrong. Actually, we get it wrong all the time. Not because we lack skills or experience, but because we are not able to predict the future. We implement something in software (and not hardware) precisely because we expect the implementation to change over time. Even if our current implementation is perfect (whatever that means, but that is a different story), it might be far from ideal one month from now or in half a year. That means we acquire additional technical debt simply because the context (e.g. constraints, requirements, scale) changes over time. Again, the term debt does not seem appropriate here.

How to avoid Technical Debt?

Looking at the three different ways in which we add technical debt, how can we best avoid it? First of all, let's not get lured into consciously adding technical debt by the false promise of short-term gains. In most cases, that is not what we get in return. So what about the other two cases, in which we add technical debt accidentally?

We can reduce the technical debt that is added because we could not find a better way of implementing or designing a feature by getting better ourselves. Continuous learning and improvement is something I am very passionate about. It takes time, but there are plenty of resources (books, videos, courses, etc.) that can help us avoid repeating the mistakes we (or other developers) made in the past. These include techniques like test-driven development, pair programming, code reviews, refactoring, design patterns and many more. I have always wanted to compile a list of useful resources and might add one in the coming days, so you have some starting points.

What about the technical debt that is added because the context of our project has changed? Well, we could easily avoid it by getting better at predicting the future. Just kidding. While we might get better over time at making reasonable predictions, this will never be sufficient to avoid the issue altogether. What we can do is accept that we will get things wrong and design the system with that in mind. This is not easy at all, and in my opinion one of the most difficult things in software development. But by designing the system for change, by deferring decisions and by keeping the codebase simple and modular, we can mitigate a lot of technical debt as soon as it occurs.
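To make "deferring decisions" slightly more concrete, here is a minimal sketch of my own (all names are hypothetical): the business logic depends on a small abstraction rather than on a concrete store, so the storage decision can change later without touching that logic.

```python
# A sketch of deferring a decision behind an abstraction (names are mine).
from typing import Protocol

class EventStore(Protocol):
    def append(self, event: str) -> None: ...
    def all_events(self) -> list: ...

class InMemoryStore:
    # Today's simple choice; a database-backed store with the same two
    # methods could replace it later without changing record_signup.
    def __init__(self) -> None:
        self._events = []

    def append(self, event: str) -> None:
        self._events.append(event)

    def all_events(self) -> list:
        return list(self._events)

def record_signup(store: EventStore, user: str) -> None:
    # The business logic only knows the EventStore interface.
    store.append(f"signup:{user}")

store = InMemoryStore()
record_signup(store, "ada")
print(store.all_events())  # ['signup:ada']
```

The point is not this particular interface, but that the expensive decision (which storage technology) is postponed behind a cheap one (a two-method abstraction).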

Whatever we do, we will sooner or later accumulate technical debt. Even a team of the most experienced and skilled engineers will have to deal with it. The important thing is to deal with it continuously and to reduce it as soon as it occurs. As long as we keep technical debt below a critical threshold (and this threshold is not very high), we do not get into the situation where we face a trade-off between code health and feature development. This is what we should aim for. But what if we are above that threshold?

How to reduce Technical Debt?

Working in a project with a large amount of technical debt is not a pleasant situation to be in. As I said before, this is something we should try to avoid from the very beginning. Don’t get into such a situation.

But what if it is too late, or you have just joined a new project which is in an unhealthy state? It can be extremely stressful to work in such an environment. If that is the case for you and the situation is not improving, I would recommend looking for other opportunities rather than remaining in that situation for too long.

In any case, if we are in such a situation, we basically have to deal with the uncomfortable trade-off between code health and feature development. That means there has to be someone in the decision chain who supports reducing technical debt. If that is not the case, I doubt that such an endeavour can be successful, and I would refer you to what I said in the previous paragraph. If, on the other hand, reducing technical debt is a priority, it will still be a very difficult task, but there are a few things we can (at least try to) do.

Replacing the whole system is most of the time very risky and something I would generally advise against. Instead, I would try to reduce technical debt more locally. I would start by improving testability and test coverage in order to increase confidence in the codebase and make further changes easier. Being able to change code with confidence might allow us to modularize some components and decouple them from the rest of the system. Replacing such components with better implementations does not carry the same risks as replacing the whole system, and it is something that has worked very well for me in the past. Apart from that, as a general rule we can try to simplify any code we touch for maintenance or feature development. This reduces technical debt while developing new features and might slowly move the whole project into a healthier state (where other changes become feasible).
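The first step, improving testability, can be sketched with a toy example of my own (the scenario and names are hypothetical): a hard-wired dependency is turned into an injectable one, which lets a test substitute a fake and pin down behavior before any further changes.

```python
# A toy sketch (mine) of introducing a seam for testability.
import datetime

# Before: untestable, because the clock is hard-wired into the function.
def greeting_before() -> str:
    hour = datetime.datetime.now().hour
    return "Good morning" if hour < 12 else "Good afternoon"

# After: the dependency is injected, with the old behavior as the default,
# so existing callers keep working unchanged.
def greeting(now=datetime.datetime.now) -> str:
    hour = now().hour
    return "Good morning" if hour < 12 else "Good afternoon"

# A test can now pin the time with a fake clock.
fake_nine_am = lambda: datetime.datetime(2024, 1, 1, 9, 0)
print(greeting(now=fake_nine_am))  # Good morning
```

Once such seams exist, the code can be covered by tests and then restructured with confidence.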

All in all, this is not easy. I am not sure if I mentioned it before, but the whole point of this post is to avoid getting here in the first place. And if you do end up here, good luck.


There is no fundamental trade-off between code health and delivering features fast. We experience this trade-off because of a large amount of technical debt. Being in such a situation is very unpleasant and significantly slows down the progress of projects. In the extreme, it makes projects fail.

And this is why we should be mindful of technical debt: the reasons why we accumulate it, how to reduce it and, most importantly, how to avoid adding it in the first place.

This is easier said than done and I hope I will find the time to get into more concrete examples here in this blog. Please let me know if there are any specific topics you would be interested in.

Scrum is fragile, not Agile

As the title suggests, this post is about two different aspects of Scrum. The first part deals with Scrum not being Agile and the second part is about Scrum being fragile.

Before going more into detail, a short disclaimer: Everything I present in this post (and in this blog in general) is my personal view and does not represent the view of my current employer, my former employers or any future employers.

Scrum is not Agile

I guess a typical reaction to this heading would be something like: "How is this possible? Scrum is not Agile? Isn't Scrum the number one Agile software development process?". The short answer is that Scrum claims to be an Agile process, but the sad reality is that it is quite far from being Agile. Let me show you why.

Let's have a quick look at the Agile Manifesto. It states that it values "Individuals and interactions over processes and tools". Let's also have a quick look at the meaning of the word agile. According to the Oxford Dictionary, agile means "able to move quickly and easily". It is no coincidence that this term was chosen to represent the high-level ideas of the Agile Manifesto. In fact, one major point behind Agile is that in many software projects it is extremely difficult to move quickly and easily. This is not the case for a completely new project, but over time many projects get into a situation where sustainable development is simply not possible anymore. To prevent this (and other issues), the Agile Manifesto and the Principles behind the Agile Manifesto provide several high-level guidelines. These guidelines are not specific, well-defined processes or tools, and they allow for many different implementations. I suspect that both of these properties (being high-level and allowing different implementations) were fully intended. The goal was not to present a silver bullet, but to help peers avoid many of the pitfalls in software development which the authors of the Agile Manifesto had experienced first-hand and which fall into exactly these categories.

Now let's have a look at the Scrum Guide (written by two of the authors of the Agile Manifesto). In contrast to the Agile Manifesto and the Agile Principles, this guide is quite lengthy. Surprisingly, the whole guide does not mention Agile a single time. I am not sure if this was always the case historically, but if the authors of the Scrum Guide do not claim that Scrum is Agile, then we would already be done with the first part of this blog post. I assume that this is not the case, so let's move on. The Scrum Guide describes a framework which contains "roles, events, artifacts, and the rules that bind them together". In other words, it is a very specific and well-defined process. This does not sound agile, and it also does not sound Agile (remember: "Individuals and interactions over processes and tools"). The irony here is quite obvious. And this is where the whole Scrum movement should have stopped. But it did not, and instead it frustrates an increasing number of software developers all around the world. And whenever a Scrum project fails, it is supposedly not because of Scrum's potential flaws, but because Scrum was not implemented correctly. That sounds like a nice transition into the second part of this post.

Scrum is fragile

This part is very short. I thought the wordplay (Scrum being agile / fragile) was kind of funny, and apart from that it perfectly describes one of the things that really bothers me about Scrum: whenever a Scrum project fails, the claim is that Scrum was not implemented correctly. And you can read about a vast number of such projects. What does it mean if a large number of intelligent software developers are not able to implement Scrum correctly? It means the whole framework is fragile. And this is another major argument against using Scrum. What is a framework good for if it is so difficult to use?

Well, it seems that with the help of expensive consulting and coaching, as well as training and certificates, Scrum might in fact provide value. But it is not clear whether this is value for the companies developing software and the hard-working software developers, or for those who offer services in and around the Scrum ecosystem.

Personal View

I would like to finish this post with a bit of my personal view regarding software development, Agile and Scrum. To me it seems that one very important part of high-quality software development is to maintain a simple priority queue of tasks. The weight is a combination of the value a task provides for the customers / developers and the estimated effort to implement this task. For some developers this comes naturally. For teams and companies for which this is not the case, Scrum offers a rather expensive and inefficient implementation of a priority queue.
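Such a priority queue of tasks can be sketched in a few lines of Python. This is only an illustration of the idea, not a recommendation: the weighting function below (value divided by effort) and the task names are made-up examples, and real teams would choose their own trade-off.

```python
import heapq

def priority(value, effort):
    # Hypothetical weighting: high value and low effort float a task to
    # the top. Negated because heapq implements a min-heap.
    return -(value / effort)

tasks = []  # min-heap; the most attractive task has the smallest key
heapq.heappush(tasks, (priority(value=8, effort=2), "fix login bug"))
heapq.heappush(tasks, (priority(value=5, effort=5), "refactor parser"))
heapq.heappush(tasks, (priority(value=9, effort=1), "update broken docs link"))

# Always pick the task with the best value-to-effort ratio next.
_, next_task = heapq.heappop(tasks)
print(next_task)  # → update broken docs link
```

The point is how little machinery is actually required: pushing and popping are both logarithmic in the number of tasks, and the whole "process" fits on one screen.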

And let's be honest: software development is difficult and complex work. Are we really surprised that so many projects fail? The field is still very young and we need to learn a lot. And this is crucial: we need to learn from past experiences, be they failures or success stories. And here we collectively fail. We are not using the wrong processes or implementing the right processes in the wrong way. We are simply caught in a rat race, unable to take a short break in order to look at and learn from all the things that have happened around us, maybe even before our time. It is our duty to extract the knowledge, the experiences and the wisdom from the many resources that are so easily available to us: the many, many books, articles and videos about software development and, last but not least, the Agile Manifesto.

The United States Presidential Election – The hope for political changes in favor of the 98%

I am following with great interest the recent developments of the campaigns for the 2016 United States presidential election. Over the last years I have lost hope in political change, mainly due to the extreme and ever-increasing concentration of power and wealth all around the world. We are living in a system where power and wealth are factually synonymous and concentrated in the hands of a small minority. A system which consists of deep and complex interdependencies among power structures such as politics, media, police, army and intelligence agencies. Even for an optimist it should be difficult to envision how this situation could change for the better. I personally thought that we had already missed the point in time where such a change would have been possible and that a certain kind of revolution in the near future seemed unavoidable. But then I heard about Bernie Sanders, one of the potential candidates of the Democratic Party for the 2016 United States presidential election. Suddenly, change seems possible again and I hope with all my heart that the voters in the United States make use of this historic opportunity. Let me be more clear:

If your annual income is less than $200,000, there is no alternative to voting for Bernie Sanders, independent of your ethnic background, your religious beliefs, your favorite party, your gender or your age.

This may seem like a strong statement, but it is exactly what I personally believe to be rational. Let me briefly explain why. As mentioned before, power and wealth are factually synonymous and concentrated in the hands of a small minority. This means that those in power have a strong self-interest in maintaining, or even increasing, this predominant inequality. The campaigns of almost all candidates are for the most part funded by wealthy individuals and big corporations. This is very suspicious for a democracy where the power should be vested in the people, and this fact alone should raise your awareness. There is an exception though: Bernie Sanders.

Currently, the majority of the population is being systematically exploited by a small minority. Most hard-working people do not get their fair share of the cake; families and children are living in poverty, while in the US more than 50% of the income goes to the top 10% and almost all of the income increase in recent years goes to the top 1%. In addition, the current system fosters an irreversible destruction of the environment. These issues are unacceptable and extremely urgent. Let me give you a few facts to think about (together with their sources).

  • The wealthiest 85 people on the planet have more money than the poorest 3.5 billion people combined. [1]
  • The amount of money that was given out in bonuses on Wall Street last year is twice the amount all minimum-wage workers earned in the country combined. [1]
  • Since 1990, CEO compensation has increased by 300%. Corporate profits have doubled. The average worker’s salary has increased 4%. Adjusted for inflation, the minimum wage has actually decreased. [1]
  • 110 million Americans live amid levels of air pollution so high that the federal government considers them harmful to their health. [4]
  • The United States has 5% of the world’s population, but 25% of the world’s prisoners. [3]
  • Each year, humankind adds six to eight billion tons of carbon to the atmosphere by burning fossil fuels and destroying forests, pumping up the concentration of greenhouse gases responsible for global warming – an effect that could raise temperatures by three to ten degrees by the year 2050. [5]
  • In 2013, 45.3 million people lived in poverty in the USA. [2]
  • The poorest half of the US owns 2.5% of the country’s wealth. The top 1% owns 35% of it. [1]
  • The top 1% of America owns 50% of investment assets (stocks, bonds, mutual funds). The poorest half of America owns just 0.5% of the investments. [1]
  • Up to 400,000 people are killed each year in the US due to preventable medical errors. [6]
  • $765 billion, or 30% of all US healthcare costs, each year is wasted. [6]
  • The US spent $80 billion on incarceration in 2010 alone. [3]
  • Three out of four young black men in Washington, D.C., can expect to serve time behind bars. This is despite the fact that people of all races use and sell drugs at the same rate. [3]
  • More than 96% of convictions in the federal system result from guilty pleas rather than decisions by juries. [3]
  • A large study has found that up to one half of all plant and animal species on dry land could face extinction by the year 2050 due to global warming. According to the World Resources Institute, 100 species die each day due to tropical deforestation. [4]
  • 92% of US physicians admitted to making some medical decisions based on avoiding lawsuits, as opposed to the best interest of their patients. [6]

This is definitely not a world in which I want to live, and I am sure that most people would agree with me. We cannot expect candidates like Hillary Clinton or Donald Trump to change things for the better for the majority of the population, the not super wealthy, or in other words: the 98% (sometimes also called the 99%). The campaigns of those candidates are funded by the establishment, and those campaign contributions are not just donations, they are investments. Therefore, my hopes are with Bernie Sanders, and I hope yours are too. If we want to live in peace with respect for the environment and its limits, if we want social security, a working health care system and a legal system based on justice, there is simply no alternative to Bernie Sanders.

I am not an American citizen myself. I was born in Germany and I am currently living in Switzerland. The problems described here are global problems and not only an American issue, although many of them are more extreme in the US than in other countries. I will probably take a closer look at some of these issues in future blog posts. For the moment I am just excited about the mere possibility of political change in the US, which might even spread to the rest of the world.


[1] http://www.alternet.org/economy/35-mind-blowing-facts-about-inequality
[2] http://www.povertyusa.org/the-state-of-poverty/poverty-facts/
[3] http://mic.com/articles/86519/19-actual-statistics-about-america-s-prison-system
[4] http://www.treehugger.com/clean-technology/20-gut-wrenching-statistics-about-the-destruction-of-the-planet-and-those-living-upon-it.html
[5] http://www.lightparty.com/Economic/EnvironmentalFacts.html
[6] http://www.forbes.com/sites/robertszczerba/2013/10/22/six-frightening-facts-you-need-to-know-about-healthcare/

Inherent Limitations for Benchmarking Problems beyond NP

Today I would like to talk about some difficulties regarding the benchmarking of hard problems. By hard problems I mean problems which are supposed to be harder than NP, or related classes, from a computational complexity point of view. This is a very interesting topic on which I am currently working in collaboration with Cassio P. de Campos from Queen's University Belfast. The motivation for this work comes from the fact that we are both confronted with very difficult computational problems in our research. In the past I have worked a lot with stochastic combinatorial optimization problems, which can be extremely difficult to solve or even to approximate. For example, the Probabilistic Traveling Salesman Problem with Deadlines is #P-hard, which implies NP-hardness and is a much stronger hardness result. In fact, already the evaluation of the objective function for this problem is #P-hard. The question is: how can we empirically assess the performance of algorithms for difficult problems such as the Probabilistic Traveling Salesman Problem with Deadlines?

The situation is not too bad if we are able to compute the value of an optimal solution in a reasonable amount of time and if we are able to evaluate solutions in a reasonable amount of time. But since the problem is #P-hard this is only possible for very small instances. Instances of practical relevance might not belong to that category. What can we do in that case?

If we create a benchmark by sampling from the set of relevant instances in a fair way, we run into the following two problems: We are in general not able to compute an optimal solution (or the value of an optimal solution) for the benchmark instances and we are in general not able to evaluate solutions provided by algorithms on the benchmark instances (in many cases not even approximately). How can we empirically assess the performance of algorithms in such a setting? It seems that we cannot do a lot here. An algorithm returns a solution and we are not able to evaluate, or even estimate, the quality of such a solution.

How can we circumvent this problem?

Are there any ways to circumvent this problem? Well, we could sample and select instances of relevant size for which the above problems do not apply for some reason, or we could use an instance generator in order to create instances together with certain information about these instances, for example the value of an optimal solution. These two approaches are commonly used, and therefore I will discuss them thoroughly in the following.

The problem with the first approach is that we are not really creating a benchmark which is representative of the set of relevant instances. Let me give you an example. Consider the following approach: we sample a certain number of instances of relevant size. We then apply an exact solver to these instances with a generous time limit. After that, we select for further usage only those instances which could be solved by the exact solver within the given time limit. The problem is that we are no longer sampling from the set of relevant instances, but instead from the set of instances which can be solved by an exact solver within a given time limit. This introduces a bias, and there is a serious threat (in fact it might be quite likely) that the resulting instances are not representative of the set of relevant instances.
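This selection bias is easy to demonstrate with a toy simulation. The model below is entirely hypothetical: each sampled instance has a hidden hardness parameter, solve time grows exponentially with hardness, and instances that exceed the solver's time limit are discarded. The surviving "benchmark" is then systematically easier than the population it was drawn from.

```python
import random

random.seed(0)

def sample_instance():
    # Hypothetical model: solve time grows exponentially with a hidden
    # hardness parameter drawn uniformly from [0, 1].
    hardness = random.uniform(0.0, 1.0)
    solve_time = 2 ** (20 * hardness)  # seconds, purely illustrative
    return hardness, solve_time

TIME_LIMIT = 3600  # keep only instances the exact solver finishes in an hour

population = [sample_instance() for _ in range(10000)]
benchmark = [(h, t) for (h, t) in population if t <= TIME_LIMIT]

avg_all = sum(h for h, _ in population) / len(population)
avg_kept = sum(h for h, _ in benchmark) / len(benchmark)
print(f"mean hardness of all sampled instances:   {avg_all:.2f}")
print(f"mean hardness of the surviving benchmark: {avg_kept:.2f}")
```

Under this model the mean hardness of the surviving instances is well below that of the full sample, which is exactly the bias described above: conclusions drawn on the benchmark need not generalize to the relevant instances.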

The second approach runs into some theoretical limitations. Whenever we use an instance generator to create instances together with certain additional information about these instances, we are also able to retrieve this additional information using an algorithm in NP or some related complexity class. If this information is useful for the benchmarking process, then it is also very likely useful for the given computational task. That means that for problems beyond NP such an instance generator cannot exist, unless the corresponding complexity classes collapse. Or in other words: any attempt to generate instances in such a way leads to a bias in the benchmark, because it is not possible to create instances from the set of relevant instances together with certain additional information without introducing some kind of bias.


Neither approach is very satisfactory, and both introduce a certain kind of bias into the benchmarking process. It seems that we are not able to empirically assess the performance of algorithms for hard problems on instances which are representative of the set of relevant instances. We can only hope that the results on our biased benchmark instances generalize to the set of relevant instances, but that does not seem very scientific to me. Computational problems are becoming more and more complex nowadays, and there is a huge number of problems with a computational complexity beyond NP. We are not able to solve them exactly, and we are also not able to empirically compare algorithms for such problems. I think what we ultimately can do is attack these difficult problems from a more theoretical point of view, for example by finding approximation algorithms for related computational tasks or by identifying important special cases which are tractable.

The Research System – Its problems and what we can do about it

This time I would like to talk about our current research system. By research system I mean the framework in which research is conducted at the beginning of the 21st century. In my opinion there are many different problems with the current system, some of which lead to false incentives and ultimately corrupt science. I will start with a discussion of what I think are the main issues with our research system. After that, I will talk about some ways in which we researchers could actively influence this system and change it into a better one. At the end, I will present some ideas of how I personally want to cope with this flawed system in the future. I can already tell you at this point that I am not leaving academia. At least not now.

The problems of our research system

There exist many blog posts and articles about the problems of the research system. Most researchers are aware of these issues or even experience them frequently. The huge problem is that nobody is doing anything about it. With our passive behavior we are gambling away the reputation of science. This needs to change.

We are conducting research within a framework that consists for the most part of non-democratic structures. There are huge differences with respect to the power and influence of different researchers within this framework, which cannot simply be explained by the quality of their work. There are third parties involved, mainly in the publication process, which impede our research work, mostly for financial reasons. We decide whether a work is worth publishing using a system which cannot work, for simple game-theoretic reasons. And finally, we measure the quality of research and the success of researchers using certain flawed indicators. All of this together leads to many false incentives for all the parties involved in the process. This might not be a problem in other situations, but here we are dealing with people who presumably belong to the smartest in our society. And therefore it is of the greatest importance that we work within a system which provides the right incentives to perform good research.


To a large extent we measure quality and success based on indicators. The quality of a publication is measured by the number of citations it receives. The quality of a journal is measured by its impact factor, which is the average number of citations received by publications within that journal. Our own success is measured by the number of publications we publish and the number of citations our publications receive. ResearchGate uses indicators called impact points and RG scores, and Google Scholar uses indicators called the h-index and the i10-index to measure our success.
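For readers unfamiliar with it, the h-index is defined as the largest h such that h of a researcher's papers have at least h citations each. A minimal sketch of the computation (the citation record below is a made-up example):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    cites = sorted(citations, reverse=True)
    h = 0
    # After sorting, paper h (0-based) must have at least h + 1 citations
    # for the index to grow to h + 1.
    while h < len(cites) and cites[h] >= h + 1:
        h += 1
    return h

# Hypothetical citation record of seven papers:
print(h_index([42, 17, 9, 6, 5, 1, 0]))  # → 5 (five papers with >= 5 citations)
```

Note how crude the summary is: the two most-cited papers could have 42 citations or 42,000 and the h-index would not change, which already hints at the problems discussed next.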

The problem with all these indicators is that they do not accurately reflect the quality of our work. At the very best there is a slightly positive correlation. Let me give you an example. On Google Scholar I have 163 citations, an h-index of 8 and an i10-index of 7; on ResearchGate I have 6.85 impact points and an RG score of 8.28. Another researcher has 8150 citations, an h-index of 34 and an i10-index of 57 on Google Scholar, and 72.06 impact points and an RG score of 28.95 on ResearchGate. Comparing these indicators, it seems that this researcher is performing much better research than I am. What if I tell you that this researcher is Zong Woo Geem, whose career is based on his “invention” of the dubious harmony search algorithm and whose academic fraud I recently exposed (see my previous blog post for more details: The Harmony Search Algorithm – My personal experience with this “novel” metaheuristic)?

If these indicators do not accurately reflect the quality of our work, why are we actually using them to measure it? Instead of encouraging high-quality research, the system gives incentives to optimize indicators. And such a system obviously leads to research output which improves the indicators, and not necessarily to research output of high quality. This is definitely not a feature we want in a research system.

Peer reviewing

We use peer reviewing to decide what gets published in journals and what gets accepted at conferences. There are several studies which show that peer reviewing does not really work as intended. For example, some studies show that the results of peer reviewing are extremely stochastic. Therefore a small number of reviewers, which is commonly used for journal articles and conference submissions, cannot adequately estimate the quality of a given work. Let me give you another argument for why peer reviewing does not work. For this purpose, let's look at it from a game-theoretic point of view (I am not aware of any such treatment in the literature; please let me know if you know of any such sources). So let's assume that you are asked to review some manuscript. What are your incentives to do a good job? Apart from ethical reasons or issues with your conscience, I cannot think of any. But are there incentives for not doing a good job? In my opinion there are a lot of them. It makes sense not to invest too much time in the review, since you could better devote that time to other parts of your research work, maybe something which might help your own career. It also makes sense to be biased in your assessment of the manuscript. It is very likely that the authors of the manuscript work in the same field as you do. If they are friends, you might want to be very positive in your review, or, if the manuscript contains serious flaws, you might want to protect them from harm. If they are competitors, you could do the exact opposite: you might want to be very negative in your review, or, if the manuscript contains serious flaws, you might want to suggest publication in order to criticize the authors later.
In game-theoretic terms, this leads to a kind of dilemma between behaving ethically and doing a good job on one side, and behaving egoistically and possibly doing a bad job on purpose on the other.

Some of these issues are thoroughly covered and well known. Still, we rely heavily on peer reviewing to assess the quality of our work. Most researchers agree with me that peer reviewing is flawed, but then they justify its usage on the grounds that there is no better method available. It hurts to hear such an argument from researchers, who are supposed to change things and to facilitate progress. At certain times there was no better system known than slavery or monarchy. People slowly understood that these systems were flawed. Instead of keeping them because no alternatives were available, they actually looked for alternatives. (By the way, this should also prompt us to think critically about the political and economic system in which we are living.) This is what we should do as well. Instead of relying on an approach which we all know is seriously flawed, we should try to find alternatives. I will come back to this issue later in this blog post.


The role of publishing companies in today's research system is kind of awkward. They run a big business with huge profit margins, but their benefits to research are not clear at all. I would even go one step further and say that they are actively impeding research work. Let's have a closer look at that.

It is, and always was, of great importance to make research results available to others. In the past this was done in the form of letters, journals or books. Here publishers played a key role. For them the whole thing was a business with financial incentives. They provided a certain service which was important for the process of communicating research results. They were not necessarily interested in promoting excellent research, but since they provided the research community with a service which had a certain value, this seems completely fair in my opinion.

This situation has changed dramatically due to the Internet and the personal computer. Using modern editing tools on personal computers, we now prepare high-quality papers (high quality with respect to typesetting and related issues) which are basically ready for publishing. Additionally, the Internet allows us to easily make our research results instantaneously available to the wider public. But instead we are still using those old structures that developed in the past, and we still publish our results with the help of publishing companies.

The crucial question here is: what kind of service are those companies actually providing to the research community? I personally do not see any kind of service they are actually providing, except their existing structures (which are probably not even a service in that sense). In my field we submit so-called manuscripts to conferences and journals. In the original meaning, “[a] manuscript is any document written by hand or typewritten, as opposed to being mechanically printed or reproduced in some automated way.” What an irony: nowadays our manuscripts are basically PDF documents which are ready for publishing. That means the only thing these publishing companies are doing, apart from some minor things, is to take our documents and print them or just put them online. For obvious reasons, we no longer need printed versions in the 21st century. That means they are taking our documents and putting them on the Internet. Is that really a service? And on top of that they make huge profits with this system.

And it gets even worse. Using these kinds of obsolete structures, they actively impede research. First of all, results could be made available much faster without the “services” of publishing companies. But the real problem is that the publishing companies are aware of the fact that they are actually providing little or no service. For this reason they try to make us dependent on these obsolete structures in order to continue with their antiquated business model. This happens on many different levels: for example, in the form of so-called “top” or “high impact” journals, by forcing us to rely on indicators, or by “sponsoring” research. Another point is that their profits rely heavily on the number of publications they are able to sell back, for example to universities. This gives them a financial incentive to publish vast quantities of papers, and they try to impose this incentive on us researchers.

The whole point is: we do not need this sort of publishing company anymore. Using personal computers and the Internet, we should be able to easily communicate our research results without the help of any third party.

Power structures

The current research system relies on several non-democratic structures and on processes which are far from transparent. In this environment, it is extremely difficult to pursue an idealistic goal such as good research. At this point I would like to briefly discuss two of these issues: the typical career path in academia and the power of researchers who are journal editors or part of a conference committee.

The typical career in academia consists of something like finishing your studies, doing a PhD, working a few years as a postdoc and then trying to somehow get a professorship, or quitting at some point in between. This is a very inflexible career path which comes with many inherent problems. Apart from that, it is commonly assumed that as a PhD student you are basically thriving on the ideas of your supervisor, and that as a postdoc you slowly start with independent research, still under the supervision of a professor. In reality, some PhD students are already doing great independent research and their supervisors are simply exploiting them (in order to optimize their own indicators). I have been doing independent research for years, I have been exploited, and still I have to prove that I am able to conduct independent research. This whole career path is strongly intertwined with the indicators I discussed before. In order to advance in your career, you have to optimize your indicators. Nobody really cares about your potential as a researcher. There are definitely many brilliant researchers who manage to build a successful career in academia, but due to our increasing focus on indicators, there is also an increasing number of researchers who build a successful career because they are better at optimizing their indicators or simply because they know how to play the game. And this creates a vicious circle.

The other point is that certain researchers are equipped with immense power in the current research system. Journal editors and committee members of conferences have the power to shape the direction of research. They are the people who decide what gets published and what does not. The problem is not that there are people who have such power; the problem is that they are selected in a highly non-democratic way and that they are able to exercise their power in a non-transparent way. No wonder we are supposed to do excessive networking at conferences instead of research.

How can we change this situation?

The current state of the research system is not acceptable. But it will not change if we just hope that those parties who profit from this system will alter their behavior and act in a more idealistic way. If we want to change something, we have to do it ourselves. The reason why I still want to continue in academia is that I somehow hope it is possible to change things. If it turns out over the next years that it is not possible to change anything, then you will find me among those researchers who simply quit and leave academia (maybe with another of those infamous blog posts). Maybe I will find another way to pursue my passion for research, but I will definitely not become some indicator optimizer.

So, how can we change the situation? This is a complex and difficult task, and I would like to invite everyone who has managed to read this far to discuss it with me (by email, on Twitter @dennisweyland, or by commenting on this blog post). I have three suggestions that I would like to share here.

The first suggestion is to stop playing the game. Forget about your indicators, stop reviewing manuscripts for the big publishing companies without compensation, and stop submitting your manuscripts to journals run by the big publishing companies. If we just continue as before without leaving our comfort zone, it is very unlikely that anything will change. Question everything about the research system and draw your own conclusions. If we all try that, I am sure change will come.

My second suggestion is to develop in a collaborative effort an alternative publishing model based on the Internet. I have a pretty clear vision of what I would consider an ideal publishing model and I think I might write a separate blog post about this topic in the near future. But it is important that we shape the system in a way which suits all of us.

The last suggestion is simply to create a union for researchers. In this way we can organize ourselves and obtain some political power in order to change things such as project funding or publishing policies at the political level. The structures that are imposed on us are not the structures we want, and maybe we can change certain things in this way.

My personal research policies

In order to give you some examples of how we could change our own behavior regarding the current research system, I would like to discuss some changes of my own research policies.

As I said, for the moment I will try to continue in academia. Unfortunately, I am not in a position to drastically change policies at a higher level. But there are some small things which I can do (and which basically everyone could do). First of all, I will no longer participate in the peer reviewing process under the current conditions. Well, peer reviewing gives me a certain power over my peers, and in theory I could use it for the overall good. But this is exactly one of the many problems with peer reviewing: it is a non-transparent system which empowers the reviewers over their peers. You might say that my behavior is not fair, because other researchers will still review my manuscripts. I can assure you that this will not be the case, because I also plan to publish my research results in an alternative way. I will put them on the Internet, freely available to everyone and open to discourse. This does not mean that I believe my work is free from defects and does not need any sort of reviewing. If there is something wrong with my papers, or there are reasons to change parts of them, I will be more than happy to do so.


The current research system is fundamentally flawed. The different problems are extremely intertwined and create an environment which is diametrically opposed to conducting good research. It will require a collaborative effort to change things. This is definitely not an easy task, but I still hope it is somehow possible. Please let me know (by email, on Twitter @dennisweyland, or by commenting on this blog post) what you think about this whole issue.

The Harmony Search Algorithm – My personal experience with this “novel” metaheuristic

In this blog post I would like to talk about the harmony search algorithm. I have written two critical journal articles regarding this heuristic, which have received wide approval in the research community, but have also triggered harsh responses from the inventor of this optimization approach, Zong Woo Geem. I will present my personal experience with the harmony search algorithm in chronological order, including the two journal articles, the response by the inventor of harmony search and other related events. At the end of this blog post I will try to identify the underlying causes that lead to such things as the harmony search algorithm. From my personal point of view these causes are rooted in our current research system, which seems to be fundamentally flawed and gives false incentives to researchers and all the other parties involved in the research process.

My first encounter with harmony search

I encountered the harmony search algorithm for the first time in a conference talk in 2009 (J. Greblicki and J. Kotowski. Analysis of the properties of the harmony search algorithm carried out on the one dimensional binary knapsack problem. In Computer Aided Systems Theory, Eurocast 2009, pages 697-704). Despite the unusual analogy to jazz music, the approach looked to me like a simple evolutionary algorithm. Nothing sophisticated, no features like self-adaptation, just very simple mutation, recombination and selection operators. This was one of my first conferences, and at that time I was seriously shocked. Now, after being in academia for a few years, I would be surprised if the proportion of publications with an actual non-negative value were more than 5%. But back then I thought that such research work was among the exceptions, and I tried to ignore the issue.

In the following months and years I was confronted with the harmony search algorithm over and over again. There seemed to be an extreme number of publications regarding so-called novel heuristics based on the most absurd metaphors: the intelligent water drops algorithm, monkey search, cuckoo search, the bat algorithm, the galaxy-based search algorithm, quantum evolutionary algorithms, weed colony optimization, the great salmon run optimization algorithm, the wisdom of crowds algorithm, the shuffled frog leaping algorithm, cat swarm optimization. I could no longer ignore this issue and decided to take a closer look at all these presumably new approaches. My initial plan was to write a very general article about all of these methods. What I did not realize at that time was that the sheer number of these approaches alone would make such a project impossible. In any case, I started with the harmony search algorithm, and during my investigations I discovered so many flaws in just this specific algorithm that I finally decided to write an article focusing only on harmony search.

My first article

What I figured out was that harmony search is indeed a special case of evolution strategies, using very simple selection, mutation and recombination operators that had been known for a long time. More than 30 years after the introduction of evolution strategies, a mathematically identical algorithm was proposed using a different terminology, namely that of jazz music. Definitely nothing worthy of 586 publications (according to a Google Scholar search in 2010 for “Harmony Search”, including the quotation marks). I published these results with a slightly informal, but nevertheless correct and accurate, proof in a journal article (D. Weyland. A rigorous analysis of the harmony search algorithm: How the research community can be misled by a “novel” methodology. International Journal of Applied Metaheuristic Computing 1 (2), pages 50-60). In the same article I also investigated several publications related to harmony search, mainly the top hits from the Google Scholar search that were freely accessible to me. Apart from that, I argued that I do not expect any real novelties from the analogy to jazz music in the context of optimization and that “research in harmony search is fundamentally misguided and that future research effort could better be devoted to more promising areas”.
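To make the structural correspondence concrete, here is a minimal sketch of the harmony search loop in Python. The function and parameter names are my own illustration, not taken from any specific publication, and the default parameter values are arbitrary. The point is that, stripped of the jazz terminology, the loop is a (mu + 1) evolution strategy: a population ("harmony memory"), one candidate per iteration built by per-variable recombination and mutation, and replace-the-worst selection.

```python
import random

def harmony_search(evaluate, n_vars, domain, memory_size=10,
                   hmcr=0.9, par=0.3, iterations=10000):
    """Sketch of harmony search for minimization over a finite domain.
    Structurally a (mu + 1) evolution strategy with mu = memory_size."""
    # Initialize the harmony memory (the population) at random.
    memory = [[random.choice(domain) for _ in range(n_vars)]
              for _ in range(memory_size)]
    scores = [evaluate(h) for h in memory]
    for _ in range(iterations):
        candidate = []
        for i in range(n_vars):
            if random.random() < hmcr:
                # "Memory consideration": take the value of variable i
                # from a random memory member -- plain recombination.
                value = random.choice(memory)[i]
                if random.random() < par:
                    # "Pitch adjustment": perturb the value -- a mutation.
                    value = random.choice(domain)
            else:
                # Otherwise draw a fresh random value.
                value = random.choice(domain)
            candidate.append(value)
        score = evaluate(candidate)
        # (mu + 1) selection: the candidate replaces the worst member
        # of the memory if it improves on it.
        worst = max(range(memory_size), key=scores.__getitem__)
        if score < scores[worst]:
            memory[worst], scores[worst] = candidate, score
    best = min(range(memory_size), key=scores.__getitem__)
    return memory[best], scores[best]
```

For example, minimizing the sum of squares over integer variables in [-5, 5] works exactly as it would with any simple evolution strategy; nothing in the loop depends on the musical metaphor.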

The response to my first article

I thought that, for me, the whole harmony search case would be closed after that. But I was completely wrong. Shortly after my article was published, the “inventor” of harmony search, Zong Woo Geem, responded by publishing a research commentary (Z.W. Geem. Research commentary: Survival of the fittest algorithm or the novelest algorithm? International Journal of Applied Metaheuristic Computing 1 (4), pages 75-79). I can only recommend that everyone read this research commentary, written by someone we consider to be among our peers in the research community. In any case, I would like to discuss a few of his statements in the following. If this discussion gets too boring at some point, feel free to skip directly to the next section.

Let me start with the abstract. The first misunderstanding is already revealed in its first sentence, where Geem writes that I have claimed in my article that “harmony search is equivalent to evolution strategies, and because the latter is not popular currently, the former has no future”. What I have done is different: I have proven that harmony search is a special case of evolution strategies. It is not just a claim, it is a statement which has been carefully proven. And more importantly, the relationship between harmony search and evolution strategies is no longer negotiable. The results have to be accepted based on fundamental scientific principles, even if they are contrary to one’s own opinions and beliefs. Apart from that, I never argued that harmony search has “no future” because evolution strategies are currently not popular. In fact, I never made any statements about the popularity of evolutionary algorithms or evolution strategies. My argumentation follows a different scheme: (1) Harmony search is a special case of evolution strategies. (2) In particular, these evolution strategies use common selection, mutation and recombination operators and are therefore representatives of this algorithm family. (3) Harmony search can never be more efficient than the best evolution strategy. (4) Research in harmony search explores already well-traveled paths, which is a huge waste of resources. (5) In a whole decade the harmony search community did not discover any new principles or mechanisms in the field of heuristics. (6) The only novel concept of harmony search is its metaphor of jazz music. For obvious reasons the underlying metaphor is not a relevant criterion for the success of heuristics. (7) I personally do not see any potential to discover new principles or mechanisms in the field of heuristics based on the metaphor of jazz music. (8) Based on all these observations I have claimed that “research in harmony search is fundamentally misguided and that future research effort could better be devoted to more promising areas”. It seems that Geem completely misunderstood crucial parts of my article, and this becomes evident already in the first sentence of the abstract.

Geem then continues to explain the purpose of his research commentary, which is “to rebut the original paper’s claims by saying (1) harmony search is different from evolution strategies because each has its own uniqueness, (2) performance, rather than novelty, is an algorithm’s survival factor, and (3) the original paper was biased to mislead into a predefined conclusion”. It is easy to see that Geem cannot succeed with his first goal, since I have already proven that harmony search is a special case of evolution strategies. Apart from the underlying metaphor of jazz music, the resulting algorithm of harmony search does not differ from a common subclass of evolution strategies. With his second goal I could almost agree, although I think performance is not the only important criterion for the success of an algorithm, and a novel approach may be interesting just for its novelty. In any case, harmony search is neither a novel method nor does it outperform other heuristics, and therefore the second goal seems irrelevant in this context. As I have said before, as a special case of evolution strategies, the only novelty of the harmony search algorithm is its metaphor of jazz music, which is not relevant from an algorithmic point of view. For the same reason, because harmony search is a special case of evolution strategies, it cannot outperform the best evolution strategy. In other words, the best evolution strategy is always at least as good as the harmony search algorithm.

Almost every single sentence of the research commentary is worth discussing in the way I have done for the abstract, but I think this would go way too far. Instead, I will discuss the most crucial statements from a higher level in the remaining part of this section.

Let me begin with a point that I have already raised in the discussion of the abstract. Throughout the research commentary Geem accuses me of claiming that harmony search is equivalent to evolution strategies (“I cannot imagine how the author concluded that HS equals ES with these different problem set and algorithm structure.” / “The author claims that because both HS and ES use solution population and produce one solution per iteration, HS is equivalent to ES if the latter is tweaked.” / “If ES really equals HS, it should be more popular in major literature because of its performance regardless of its novelty.” / “Also, if ES equals HS, GA also equals HS after tweaking GA (uniform crossover, neighboring mutation, multiple parents (polygamy?), etc). But, why did the author insist ES = HS rather than GA = HS?” / “His logic is HS equals ES…”). Once again, I did not claim that harmony search and evolution strategies are equivalent. I claimed and proved that harmony search is a special case of evolution strategies. In my article I did not explicitly state my result as a theorem followed by a proof, but the logic of the article followed exactly this scheme, and it is sad that Geem is not able to understand that. In fact, he writes “His logic is … without any rigorous analysis such as theoretical proof or numerical example.” and “… he should use academic way (mathematical proof or numerical results)…”.

One important point that Geem raises throughout his research commentary is that the success of an algorithm depends highly on its performance (“Again, any algorithm, which performs better, is naturally selected by the research community…” / “Rather, its performance such as solution quality and computational effort is the key factor to accept.” / “I believe HS was selected because it has performed better than other algorithms instead of novelty.”). I completely agree that performance is a very important property of an algorithm or a heuristic. But since I have shown that harmony search behaves in the same way as a common evolution strategy, it is theoretically not possible for it to beat evolution strategies. Applying the best possible evolution strategy to a problem should always lead to the same or even better results than applying the best harmony search algorithm. Geem also asks “Why is ES not popular in one of my research areas?”. Assuming that harmony search performs well, the same or even better performance can be expected from evolution strategies. So it is indeed puzzling why evolution strategies or other methods are not that popular in this research area.

Another issue raised by Geem is that not he, but I am misleading the research community. The first part of my article contains a formal proof of the relationship between harmony search and evolution strategies, and I do not see how this can be misleading in any way. I then discussed certain publications about harmony search in more detail, and the author complains several times about my selection of these publications. I carefully explained in my work how the publications were selected for the discussion: I selected those publications that were found by a Google Scholar search and that were freely available. It might be that this selection of publications is not representative of the whole field of harmony search, but I seriously doubt that. In any case, this was not any form of biased selection one could complain about. I also cannot understand how the author tries to defend the experimental setup used in many of the harmony search publications. Several prerequisites that are fundamental for conducting meaningful experiments are simply not respected, and therefore the conclusions drawn from these experiments are of very limited significance.

It is also quite funny that Geem honestly exhibits his lack of knowledge (“Although I do not know much about ES…” / “Most importantly, when I searched Wikipedia, I could not find the structure (mu + 1)-ES …” / “Maybe (mu + 1)-ES is possible, but appears not popular.”). At some point he states that “ES was developed to solve continuous-valued problem … while HS was originally developed to solve discrete-valued problem…” and uses this as an argument against my mathematically proven results. It is quite interesting to note that some of the very first applications of evolution strategies indeed dealt with discrete-valued problems (I. Rechenberg. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. 1973). This was around 30 years before harmony search was proposed, at a time when optimization was performed without the help of computers and Galton boxes were used to create discrete random values. In the same publication the above-mentioned (mu + 1)-ES was discussed. It also seems that the author consulted the literature to find a match between harmony search and evolution strategies (“In other words, I never found any exact match between ES and HS with respect to the algorithm structure, …”). This is quite confusing, considering the fact that I have discussed the relationship of harmony search and evolution strategies in detail in my article and that such a discussion has also been performed independently in another article (M. Padberg. Harmony search algorithms for binary optimization problems. In Operations Research Proceedings 2011, pages 343-348). All in all, I am very grateful to the author of the research commentary for being honest about his lack of knowledge in the field of optimization and for demonstrating that he does not possess the prerequisites to conduct research in this field.

At one point Geem himself suggests that harmony search might also be a special case of genetic algorithms (although he formulates it slightly differently: “Also, if ES equals HS, GA also equals HS after tweaking GA (uniform crossover, neighboring mutation, multiple parents (polygamy?), etc).”). I think it might indeed be the case that harmony search is also a special case of genetic algorithms (which would reveal the not very surprising and probably known fact that the intersection of evolution strategies and genetic algorithms is not empty). The author then continues with the following two questions: “But, why did the author insist ES = HS rather than GA = HS? Is this because GA is still popular while ES is not?” Apart from the fact that I did not prove equality of the two methods, I do not know why I selected evolution strategies and not genetic algorithms. It was definitely not because of the popularity of either of these two methods. I am quite sure that it is possible to prove that harmony search also falls into the genetic algorithms framework, which would even strengthen the argumentation of my previous article. I would be glad to publish such a proof, if it were received by Geem in a more scientific way than my first article was.

It is also interesting to note that Geem attacks me personally and insults me in his research commentary (“… instead of being rejected by an unverified author who pretends to be an Ultimate Being.” / “I guess that the author was pressed to publish a paper using predefined conclusion and biased bases.” / “In this paper, I tried to briefly demonstrate how fictitiously the author wrote a ‘novel’.”). He also calls my article a “hoax paper” and tries several times to instruct me (“However, if the author wants to write a real paper, his logic should be…” / “If the author really wants to criticize a theory, he should use academic way (mathematical proof or numerical results) rather than malediction with predefined conclusion and biased supports. Only in that way, the real chance for the author to write a good paper is given.”). The whole research commentary is also full of absurd and non-scientific statements (“I’d like to rebut this musically. People may easily tell Schöberg’s from Haydn’s. However, sometimes people may not be able to tell Haydn’s from Mozart’s because they share similarity (e.g. Sonata structure). Or, religiously speaking, people may hardly tell the difference between two denominations under Christianity. If we tweak the liturgy of Denomination A, it may become that of Denomination B. In this case, can we say A equals B? (If someone is ecumenical, (s)he may say so though).” / “With an analogy in music, every day so many musicians produce numerous ‘novel’ music pieces all over the world. But only a few go to a music chart based on audience’ preference rather than novelty.” / “Even if any limitation of HS is found in the future, I do not think it is a deadlock because I believe the maxim ‘where there is a will (of overcoming the limitation), there is a way (to solve).’ I imagine this overcoming process is similar to a breakthrough moment of local optimum.” / “My last wish is that the author becomes Paul rather than Saul of HS, or that he becomes the president of a new algorithm rather than the sniper of the existing algorithm (I do not want to mention another academic vandalism by the IP which locates the author’s region).” / “Also, I think the author’s research patron (Swiss National Science Foundation) would be much happier if the author wrote a more constructive paper rather than publication-oriented paper under the financial benefit and time duration.”).

If you want to take a short break from your research work, try to read his research commentary. It will make you sad, but it will also make you laugh at the same time.

My second article

I wonder if anybody managed to read the whole last section. To me it seems quite bizarre that the author of the research commentary is among our peers in the research community, and it is hard for me to imagine coming away with different feelings after reading it. In any case, the response in the form of the research commentary and the ever growing number of publications regarding harmony search (more than 9000 according to a Google Scholar search at the end of 2014 for “Harmony Search”, including the quotation marks) forced me to react. For this purpose I wanted to study some harmony search publications in more detail and to provide a formal proof of the result from my first article. This led to a second journal article (D. Weyland. A critical view on the harmony search algorithm – How to not solve sudoku. Operations Research Perspectives 2, pages 97-105).

Putting the proof in a more formal framework was not a big deal. But my study of a certain harmony search article (Z.W. Geem. Harmony search algorithm for solving sudoku. In Knowledge-Based Intelligent Information and Engineering Systems, pages 371-378), written by the inventor of harmony search, Zong Woo Geem, exposed some shocking flaws. The article by Geem is basically about using the harmony search algorithm to solve sudoku puzzles. Sudoku is a kind of puzzle you can find, for example, in newspapers. You are given a 9×9 square, partially filled with numbers in the range from 1 to 9. The goal is to fill in the missing numbers such that in each row, in each column and in each of the nine 3×3 subsquares (into which the big square can be uniquely partitioned) the numbers from 1 to 9 each occur exactly once. There exist many problem-specific algorithms which can solve even the hardest of these puzzles in a very short amount of time. So the first question that arises is: why would we even try to solve this problem with a heuristic? This is simply a conceptual flaw. But for the moment we can easily ignore this fact, since things get worse pretty soon.

Let us have a look at the objective function Geem uses in his work. For a potential solution to the sudoku puzzle, he sums, over each row, each column and each subsquare, the absolute difference between 45 and the sum of the values it contains. Since the sum of the values from 1 to 9 is 45, it is clear that the unique solution to the sudoku puzzle has an objective value of 0 and is therefore an optimal solution with respect to the given objective function. Unfortunately, it is not obvious whether or not there are other optimal solutions with respect to this objective function. In his article Geem states the following:

It should be noted that, although the sum of each row, each column, or each block equals 45, it does not guarantee that the numbers 1 through 9 are used exactly once. However, any violation of the uniqueness affects other row, column, or block which contains the wrong value jointly.

This statement itself is true, but does it really imply that there is only one optimal solution with respect to the given objective function? It turns out that this is not true in general. In fact, starting from the unique solution of the sudoku puzzle, we can easily derive additional solutions with the optimal objective value of 0. Just take any 2×2 square which does not contain numbers given in the description of the puzzle and which is completely embedded in one of the 3×3 subsquares. Increase the numbers in the top left and bottom right positions of the 2×2 square by 1 and decrease the numbers in the top right and bottom left positions by 1 (if this is not possible, exchange the increase and decrease operations). In this way the sums of the values in the affected rows, columns and the affected 3×3 subsquare do not change. The resulting solution is still optimal with respect to the objective function proposed by Geem, and very likely there is a huge number of such optimal solutions which are not the unique solution to the sudoku puzzle. This is a serious flaw. It seems that Geem was not aware of this issue. He often uses terms such as “the optimal solution” or “the global optimum”, which both imply uniqueness of the solution with respect to the given objective function. It is therefore also not clear whether the results reported in his article are with respect to finding the unique solution of the sudoku puzzle or with respect to finding any of the optimal solutions.
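This construction is easy to verify in code. The sketch below implements the objective function as described above and applies the 2×2 perturbation to a completed grid. The grid is generated by a standard shift construction purely for illustration; it is not the instance from Geem's article, and for simplicity I ignore the caveat about given cells, which does not affect the objective-value argument.

```python
def objective(grid):
    """Geem's objective: sum of |45 - s| over all rows, columns and 3x3 blocks."""
    total = 0
    for i in range(9):
        total += abs(45 - sum(grid[i]))                       # row i
        total += abs(45 - sum(grid[r][i] for r in range(9)))  # column i
    for bi in range(0, 9, 3):
        for bj in range(0, 9, 3):
            total += abs(45 - sum(grid[bi + r][bj + c]
                                  for r in range(3) for c in range(3)))
    return total

def is_valid(grid):
    """True iff every row, column and 3x3 block is a permutation of 1..9."""
    full = set(range(1, 10))
    for i in range(9):
        if set(grid[i]) != full:
            return False
        if {grid[r][i] for r in range(9)} != full:
            return False
    for bi in range(0, 9, 3):
        for bj in range(0, 9, 3):
            if {grid[bi + r][bj + c] for r in range(3) for c in range(3)} != full:
                return False
    return True

# A valid completed sudoku grid (standard shift construction).
solution = [[(i * 3 + i // 3 + j) % 9 + 1 for j in range(9)] for i in range(9)]

# Perturb the 2x2 square in the top-left 3x3 block:
# +1 on its diagonal, -1 on its anti-diagonal.
fake = [row[:] for row in solution]
fake[0][0] += 1; fake[1][1] += 1
fake[0][1] -= 1; fake[1][0] -= 1

print(objective(fake), is_valid(fake))  # objective stays 0, but the grid is invalid
```

Every affected row, column and block gains a +1 and a -1, so all the sums (and hence the objective value) are unchanged, even though the perturbed grid violates the uniqueness constraints.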

I then tried to reproduce the results of the computational experiments performed by Geem. I quickly reimplemented the harmony search algorithm and ran it on the sudoku instance used by Geem. I did not expect to match his results exactly, since there is some stochasticity involved in the harmony search algorithm, but I at least expected to match the order of magnitude of his results. It turned out that this was not possible at all. Geem reported that for most of the parameter settings of the harmony search algorithm between 100 and 1,000 iterations were necessary to find the optimal solution. I performed 20 independent runs for each parameter setting and never found any optimal solution within 10,000 iterations, or the unique solution to the puzzle within 1,000,000 iterations. How can we explain this discrepancy? Maybe some theoretical considerations can help us here.

The harmony search algorithm for the sudoku puzzle is quite simple, and it is therefore possible to derive bounds for the probability that the unique solution of the sudoku puzzle is found within a certain number of iterations. For this purpose we calculate the probability that the unique solution is found during the initialization process and a bound for the probability that the unique solution is found within one iteration of the algorithm. The first probability turns out to be smaller than 10⁻³⁷. The second probability depends heavily on the parameters of the harmony search algorithm. With a harmony memory consideration rate of 0.5 and a pitch adjustment rate of 0.1, for example, the probability of finding the unique solution within 10,000 iterations is bounded from above by 5.2 · 10⁻⁸. Putting things together, we can even compute an upper bound on the probability that all 12 runs from Geem’s article with a harmony memory consideration rate of 0.5 would find the unique solution to the sudoku puzzle within the maximum of 10,000 iterations. In Geem’s results the unique solution was always found within the 10,000 iterations, and the exact number of iterations until it was found was rather low in all of these runs. The probability of this outcome is bounded from above by 4.83 · 10⁻¹⁰⁰. This is an astronomically small number. To get a feeling for how small it really is, consider the number of atoms in the visible universe, which is commonly estimated to be around 10⁸⁰.
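As a sanity check on the orders of magnitude, the crude parts of this calculation can be redone in a few lines. The number of empty cells used below is a hypothetical stand-in, since I do not reproduce the actual instance in this post, and the joint bound simply multiplies the per-run bound of 5.2 · 10⁻⁸; it is therefore weaker than the 4.83 · 10⁻¹⁰⁰ bound above, which additionally exploits the low iteration counts actually reported.

```python
# Probability that uniform random initialization produces the unique
# solution: each empty cell is filled independently from {1, ..., 9}.
k_empty = 40                      # hypothetical number of empty cells
p_init = (1.0 / 9.0) ** k_empty   # below 10^-37 for k_empty >= 40

# Per-run bound: P(unique solution within 10,000 iterations) <= 5.2e-8
# for a memory consideration rate of 0.5 and pitch adjustment rate of 0.1.
# The 12 runs are independent, so the joint probability that all of them
# succeed is at most the product of the per-run bounds.
per_run_bound = 5.2e-8
joint_bound = per_run_bound ** 12

print(p_init, joint_bound)
```

Even this loose product is around 10⁻⁸⁸, already astronomically small; the tighter bound in the article only makes the discrepancy more damning.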

This theoretical result is really shocking. And at this point it is difficult not to suspect some sort of deliberate cheating by Geem.


What conclusions can we draw from this whole affair? There is definitely a problem in the field of heuristics. We are overrun by a vast number of presumably novel heuristics based on increasingly absurd metaphors, which in some cases have nothing in common with optimization anymore. This issue is thoroughly discussed in a 2013 article by Kenneth Sorensen (K. Sorensen. Metaheuristics – the metaphor exposed. International Transactions in Operational Research, 2013). In my two journal articles and in this blog post I have shown that the harmony search algorithm is nothing more than a fraud. And I suspect that the situation is similar for many of the other presumably novel metaphor-based heuristics.

But in my opinion the rather sad developments in the field of heuristics are completely understandable. We are operating in a fundamentally flawed research system with a lot of financial interests and many false incentives. From this point of view the work by Geem seems misguided, but somehow reasonable. It is therefore not sufficient to criticize him and others for fraudulent and unethical behavior and to leave aside the much bigger problems of our research system. Peer-reviewing in its current form is just not working. The pressure to publish in order to advance your career or in order to obtain grants is immense and leads to wrong incentives. Financial interests of third parties which are not directly involved in the research process (e.g. publishers) are further corrupting the research system. I guess I will write a blog post about this topic in the near future, but what I want to say at this point is the following: The behavior of Geem is not justifiable at all and definitely affects research in heuristics in a negative way. This is a serious issue, but at the same time we should also think about the bigger picture.

On the more constructive side, I would propose to treat heuristics in a purely mathematical and technical way. Metaphors and analogies might (or might not) help in the construction of heuristics, but the resulting methods should be treated from a mathematical and technical point of view. I have recently submitted an article in which I suggest seeing heuristics as methods that iteratively sample and evaluate solutions according to some probability distribution, which is regularly updated based on the results of the sampling process. A similar statement was made in the context of the 11th Metaheuristics International Conference, MIC 2015, in a document with the title “A Research Agenda for Metaheuristic Standardization”. Based on a solid mathematical and theoretical foundation, it is then possible to develop and compare different heuristics in a meaningful way. And this is where I personally see the future of research in the field of heuristics. I am currently working on a metaphor-free heuristic which can make use of the latest technological achievements in the form of cloud computing and GPU computing. More about that soon in another blog post.
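To illustrate the sample-evaluate-update view of heuristics, here is a minimal sketch in Python. It is not the method from my submitted article; I use the well-known cross-entropy update rule merely as one concrete instance of the framework: keep an explicit probability distribution over solutions, sample and evaluate a batch, then refit the distribution to the best samples.

```python
import random
import statistics

def sampling_heuristic(evaluate, n_vars, iterations=100,
                       sample_size=50, elite_frac=0.2, noise=0.5):
    """Sketch of a heuristic as iterative sampling from an explicit,
    regularly updated probability distribution (here an independent
    Gaussian per variable, refitted to the elite samples each round)."""
    mu = [0.0] * n_vars           # distribution parameters: means
    sigma = [1.0] * n_vars        # ... and standard deviations
    n_elite = max(2, int(elite_frac * sample_size))
    for _ in range(iterations):
        # Sample a batch of candidate solutions from the current distribution.
        batch = [[random.gauss(mu[i], sigma[i] + noise) for i in range(n_vars)]
                 for _ in range(sample_size)]
        # Evaluate and keep the best fraction.
        batch.sort(key=evaluate)
        elite = batch[:n_elite]
        # Update the distribution: refit it to the elite samples.
        for i in range(n_vars):
            column = [x[i] for x in elite]
            mu[i] = statistics.fmean(column)
            sigma[i] = statistics.stdev(column)
        noise *= 0.95             # slowly decay the exploration noise
    return mu

# Example: minimize sum((x_i - 3)^2); the distribution's means converge
# towards the optimum at (3, 3).
```

Seen through this lens, population-based heuristics differ only in how they represent the distribution and how they update it, which makes meaningful comparisons possible without any metaphors.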

The relativity of philosophy, a rule-based Universe and free will

An axiomatization of philosophy

Many philosophical publications deal with absolute questions and potential answers to such questions. This is true for ancient as well as modern pieces of philosophy. To me this seems quite surprising, considering the fact that we are part of a system which we do not fully understand. Of course, it could make sense to formulate and answer absolute questions, provided that we possess sufficient and correct information about all relevant aspects related to such a question and that we are, in addition, certain about the correctness of the available information. Questions like “Is Switzerland a country in Europe?” or “Did it rain in London on August 23, 2015?”, with the conventional meanings for these words and dates, are perfectly legitimate, while this is not true for questions like “Do humans have free will?”, “Is there a meaning of life?” or “What caused the Big Bang?”. We simply do not possess the information required to answer the latter questions, and it is not clear whether we will ever be able to obtain sufficient information. Nevertheless, it is possible to relativize those questions based on certain assumptions. We could ask “Do humans have free will, assuming that the Roman Catholic view of the Universe (whatever that means) is correct?” or “Do humans have free will, assuming that quantum theory is an adequate model for the behavior of the Universe?”. Such assumptions are similar to the axioms used in mathematics, and on top of them we are able to construct a certain kind of philosophy. Without explicit assumptions given, we can only expect that the reasoning is based on the subjective understanding of the Universe of some person, or of the person’s social environment, at the time of writing.

I have recently read the book “Hirnforschung und Willensfreiheit” (which could be translated as “Brain Research and Free Will”), containing several essays written by philosophers, theologians, historians, literary scholars and criminal law experts regarding brain research and free will. It is evident that the essays are based on completely different subjective models of the Universe. This leads to an impressive amount of confusion among the different authors, and apparently to diverging answers regarding the free will of humans. It is not surprising at all that the answer to such an absolute question about the free will of humans is different for someone who reasons based on the Roman Catholic view of the Universe and for someone else who reasons based on the latest physical theories. It is not the answers that are wrong; the absolute form of the question is the actual problem, at least as long as we are not able to give an absolute answer.

Another absolute question is “What is the Universe made of?”. This question has puzzled humans since the beginnings of philosophy. The answer is that we simply do not know what the Universe is made of. In ancient times some people believed that the Universe is made of the five elements earth, water, air, fire and aether, while others, following Leucippus and Democritus, believed that the Universe consists of indivisible atoms and empty space between them. To me personally the ideas of those times seem like wild guesses. Nowadays we despise the idea that the Universe is made of five elements, while we praise the ingenious foresight of Leucippus and Democritus, because their ideas are more or less consistent with our modern physical theories. But the point is that we do not know what the Universe is made of. How can we praise one wild guess from ancient times while we despise the others? Who knows whether the next physical theory will be consistent with the idea of elementary particles or not. Maybe we will realize at some point that a continuous physical model is a better description of the Universe. Again, the absolute form of the question is the problem. And it will remain a problem as long as we are not able to give an absolute answer. And who knows whether that is possible or will ever happen.

Meanwhile, I would suggest relativizing such absolute questions. Making certain assumptions gives us a solid foundation to base our philosophy on. In this way we could avoid the confusion caused by implicit assumptions rooted in different subjective understandings of the Universe. Of course, different sets of assumptions are legitimate as long as they are consistent with the information we possess about the Universe. But within a fixed set of assumptions a healthy discourse could evolve.

A rule-based Universe

In the remaining part of this blog entry I would like to discuss a certain set of assumptions and what these assumptions would imply for the free will of humans. In fact, my only assumption is that the behavior of the Universe strictly follows certain rules. By this very vague definition I mean that the Universe evolves from a given state in a certain way according to some laws. I would like to add that these laws might even be of a stochastic nature, which means that the Universe is not necessarily deterministic. We do not know whether this assumption is correct, and I do not want to argue about that at this point. Instead, I would like to discuss whether this assumption is plausible, that is, whether it is consistent with our experiences and observations of the Universe. Afterwards, I will focus on the implications of this assumption for the free will of humans.

It is not possible to prove that the assumption is true. But what does it mean for the assumption to be plausible? By plausible I mean that there is no compelling evidence for observations that contradict the assumption. In my opinion the assumption is indeed plausible. I am not a theoretical physicist, but according to my understanding of physics the assumption is consistent with modern physical theories. In fact, the assumption seems to be the basis of physical models. I also think that most of our observations and experiences of the Universe are consistent with that assumption. Galaxies, stars, planets and intelligent life could all have evolved in such a rule-based Universe. Only our subjective perception as individuals seems to contradict the assumption. But our subjective perception has turned out to be wrong in many situations, and I can imagine that this perception itself evolved in such a rule-based Universe. While this is not a full proof of plausibility, I am personally not aware of any compelling evidence against the assumption. If any such evidence is ever found, the following discussion of the implications of the assumption for the free will of humans will become obsolete.

Implications on free will

So what are the implications of a Universe that follows certain rules for the free will of humans? To answer this question we first have to define what we actually mean by free will. Free will is the ability of a human to choose between different possible courses of action. In some situations it seems that, on a certain level of abstraction, we have the choice between different options. On a hiking trail we might arrive at a fork where we could go either left or right. By free will I do not mean the mere existence of these options on a certain level of abstraction. I also do not mean that there is an actual possibility in a stochastic sense. By free will I mean that a certain mechanism belonging to an individual is able to actively choose one of the available options. When deciding between several options, this mechanism is the sole cause of the decision. Obviously, this is not compatible with our initial assumption, since such a mechanism would override the rules that determine the behavior of the Universe. On the other hand, this conclusion is in stark contrast to our subjective feelings, which suggest that we can actively make decisions that are not limited by any rules. But based on our assumption, this kind of free will has to be an illusion. Our intuition has turned out to be wrong many times throughout recorded history, so it would not be that surprising if free will actually were an illusion.

Are there any other reasons against this kind of free will, perhaps some that are likewise rooted in our subjective perception? There are. Or at least there are questions we have to adequately address in order to allow for something like free will. We humans do not attribute the same kind of free will, which our subjective perception grants us, to other forms of life. The border separating forms of life that possess free will from those that do not may vary from person to person, but the point is that for all forms of life we can imagine situations in which, on some level of abstraction, different actions are possible. We then observe that these forms of life perform exactly one action. This is exactly what we observe in other human beings. One question we have to answer is: if those forms of life do not have free will, why would we proclaim free will for ourselves, although we seem to behave in the same way? Another interesting question is connected with the development of humans. Up to a certain stage in the development of a human being (which might even be during the growth of the embryo) we do not observe any sort of free will. The question we have to answer is why this should change at a later stage. The last question I would like to present here is related to the theory of evolution. This theory is compatible with our assumption, and so far there is no compelling evidence against it. If evolution is a valid description of the development of complex forms of life from extremely simple organisms, and if we do not attribute free will to such simple organisms, then free will must be a product of evolution. The questions we have to answer are: at which point during the evolutionary process did free will evolve, and why would free will be better from an evolutionary point of view than the illusion of free will?

So do we humans have free will or not? We do not know. But what we can say with certainty is that the concept of free will as used above is not compatible with our assumption of a rule-based Universe.

Blog Launched

Finally, I have managed to add a blog to my personal website. This has been planned for quite a while; in fact, one of the reasons I created this website was to present my ideas and views to the public.

As a researcher I am of course interested in science. I have studied computer science and mathematics, and luckily I am still fascinated by these two fields. But my interests are not limited to those areas; I am also extremely interested in other topics such as politics, philosophy, physics, social problems, economic problems, justice, music, movies, the research system, and many more.

This blog will therefore contain posts on a wide variety of topics. The overall goal is to present my personal ideas and views on various matters and to initiate discussions on those topics.