Brainfish Eat Fishbrain

Developing for Python & Flask & SQLAlchemy & UWSGI

· Tycho

When you ask a question on StackOverflow or somewhere else about programming, you are looking for a solution which you can quickly use, or copy/paste etc. However, often you also need an explanation of the root cause of the issue, otherwise you are bound to make the same mistake time after time in subtly different ways.

When I first used Java Servlets (end of the 1990s) after many years of cgi-bin (not fastcgi), I was creating an application for a local municipality (which explains why we were using Java). I was working with a colleague in the office and the Java application ran on a local server in the office; we were deploying new Servlet JAR files all the time and testing them.

After a bit we started noticing that, after logging into the application, we saw each other's data and session information. This was pretty worrying and we didn’t understand why. When checking books (we didn’t have many) or online forums (there weren’t many), we always got answers that showed us examples of apps other people had made, but never an explanation of why our application failed like this.

It is 25 years later, and this still happens. Because of the uptick in AI/ML, we have to work with Python more and more, as it is the language of choice in the ML world. Many people, including us, are going from Jupyter notebook to production server quite rapidly, and the colleagues and contractors we work with mostly have experience working locally in notebooks and then running a test Flask server on that code.

One of the more common errors we see when an application is first deployed on a test server with UWSGI is a DB transaction error which looks ‘unexplained’ when reading the code and which the developers cannot reproduce locally. The error looks something like this:

sqlalchemy.exc.PendingRollbackError: Can't reconnect until invalid transaction is rolled back.

When trying to find answers to this issue, it becomes apparent that many people run into it, but the suggested solutions are quite diverse: setting the uwsgi lazy-apps option to true, using connection pools, using @postfork hooks for uwsgi etc.

The actual root cause, however, is the same as 25 years ago with Servlets: you should not use global variables in your code when running Python applications in uwsgi, as they probably won’t work the way you expect. Unfortunately no one gives that as a simple, one-line answer, and AI/ML notebook code is often completely stuffed with globals.

To save on resources, the runtime avoids re-importing and reloading the modules for every worker: by default uwsgi loads the application once and then forks the workers from that process. So when you do something with globals, like create a database session, something many people would do in notebooks and small local projects, that global is shared between the forks (or between threads in Servlets and other frameworks). Even file descriptors such as the database connection are shared, and that is what gives the rollback errors.
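The descriptor sharing can be shown without a database at all. What follows is only a minimal sketch, assuming a Unix system (it needs `os.fork`). A temporary file stands in for the database socket, and `shared` and `read_in_child` are made-up names for illustration. The child reads from the inherited descriptor, and the parent then finds that its own read position has moved.

```python
import os
import tempfile

# Module-level "global" resource, opened once before any fork, the way
# notebook-style code opens a DB connection at import time.
# (A temp file is an assumption standing in for the DB socket.)
shared = tempfile.TemporaryFile(buffering=0)
shared.write(b"0123456789")
shared.seek(0)


def read_in_child(n):
    """Fork, and let the child read n bytes from the inherited descriptor."""
    pid = os.fork()
    if pid == 0:
        # Child process: same underlying open file as the parent.
        os.read(shared.fileno(), n)
        os._exit(0)
    # Parent process: wait until the child is done.
    os.waitpid(pid, 0)


read_in_child(4)

# The parent never read anything itself, yet its position has advanced:
# the child consumed the first four bytes of the shared descriptor.
parent_sees = os.read(shared.fileno(), 3)
print(parent_sees)  # b'456'
```

With a real database connection the shared descriptor is a socket. Several workers talking over it at the same time is a plausible way to end up with a transaction that the server and the client no longer agree on.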

To summarize:

Do not use globals in your code

Always good advice anyway!
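As a rough sketch of what that can look like in practice: create the connection inside the request and close it at the end, so nothing live exists at module level to be inherited by a fork. The standard library `sqlite3` module is used here as a stand-in for SQLAlchemy to keep the example self-contained. The names `request_connection` and `handle_request` and the `visits` table are hypothetical.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager


@contextmanager
def request_connection(db_path):
    """Open a connection for one request; roll back on error, always close."""
    conn = sqlite3.connect(db_path)
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()


def handle_request(db_path, name):
    # Hypothetical request handler: the connection is created here,
    # not at import time, so no fork can inherit it.
    with request_connection(db_path) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS visits (name TEXT)")
        conn.execute("INSERT INTO visits VALUES (?)", (name,))
        return conn.execute("SELECT COUNT(*) FROM visits").fetchone()[0]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    print(handle_request(path, "alice"))  # 1
    print(handle_request(path, "bob"))    # 2
```

With SQLAlchemy the same idea would be to open a Session per request and close it when the request ends, rather than keeping one session in a module-level variable.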