This is a section from the open-source living textbook Better Code, Better Science, which is being released in sections on Substack. The entire book can be accessed here and the GitHub repository is here. This material is released under CC-BY-NC.
Code formatting tools
Writing standards-compliant and well-formatted code requires a deep knowledge of the relevant standards, which for Python are primarily laid out in the Style Guide for Python Code, also known as *PEP8*. I'd venture a guess that very few coders have actually sat down and read PEP8. Instead, many of us have learned Python style implicitly by looking at others' code, but increasingly we resort to automated static analysis tools, which can identify potential errors and reformat code without actually executing it. These tools are commonly known as linters, after the `lint` static analysis tool used in the C language. There are numerous such tools for Python; for our examples we will use the `ruff` formatter, which has become popular due in part to its speed.
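To make "without actually executing the code" concrete, here is a minimal sketch of a static check built on Python's standard `ast` module. It is a toy for illustration only and says nothing about how `ruff` is implemented; the function name `find_star_imports` is hypothetical. It parses source text into a syntax tree and reports star-imports, without ever importing or running the code it inspects.

```python
import ast


def find_star_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, module name) for each `from ... import *` in source."""
    tree = ast.parse(source)  # parses the text only; nothing is executed
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and any(
            alias.name == "*" for alias in node.names
        ):
            # node.module is None for relative imports such as `from . import *`
            findings.append((node.lineno, node.module or ""))
    return findings


example = "from numpy.random import *\n\nmynum=randint(0,100)\n"
print(find_star_imports(example))  # [(1, 'numpy.random')]
```

Note that numpy does not need to be installed for this to run, because the import statement is only parsed, never executed.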
A very useful feature of static analysis tools like `ruff` is that they can easily be integrated into most IDEs, so that they can flag problems in the code as it is written. In addition, most modern IDEs will automatically suggest changes to improve the formatting of code.
Let's start with writing some poorly formatted code, using the VSCode IDE. Here is a screenshot after adding a couple of lines of poorly formatted or anti-pattern code:
Figure 1. IDE suggestions to fix poorly formatted or problematic code within VSCode. The top panel shows two lines of problematic code. The squiggly underlines reflect `ruff`'s detection of problems in the code, which are detailed in the popup window as well as the Problems panel below. The IDE is also auto-suggesting a fix to the poorly formatted code on line 8.
We see that `ruff` detects both formatting problems (such as the lack of spaces in the code) and problematic code patterns (such as the use of star-imports). We can also use `ruff` from the command line to detect and fix code problems:
❯ ruff check src/BetterCodeBetterScience/formatting_example.py
src/BetterCodeBetterScience/formatting_example.py:6:1: F403 `from numpy.random import *` used; unable to detect undefined names
|
4 | # Poorly formatted code for linting example
5 |
6 | from numpy.random import *
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ F403
7 |
8 | mynum=randint(0,100)
|
src/BetterCodeBetterScience/formatting_example.py:8:7: F405 `randint` may be undefined, or defined from star imports
|
6 | from numpy.random import *
7 |
8 | mynum=randint(0,100)
| ^^^^^^^ F405
|
Found 2 errors.
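The F403 message says the linter is "unable to detect undefined names" after a star-import. The sketch below illustrates why: a star-import adds a whole set of names to the namespace, and nothing in the importing file says which ones. To keep the example runnable without third-party packages, it uses the standard library's `random` module as a stand-in for `numpy.random`; the helper `names_from_star_import` is a hypothetical name introduced here.

```python
def names_from_star_import(module_name: str) -> set[str]:
    """Return the names that `from <module_name> import *` adds to a namespace."""
    namespace: dict[str, object] = {}
    # Run the star-import inside an isolated namespace so we can inspect it
    exec(f"from {module_name} import *", namespace)
    # exec() also inserts __builtins__; drop it so only imported names remain
    namespace.pop("__builtins__", None)
    return set(namespace)


imported = names_from_star_import("random")
print(f"{len(imported)} names imported")
print("randint" in imported)
```

Reading the file alone, there is no way to tell that `randint` is one of those names; an explicit import such as `from random import randint` records that information in the code itself.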
Most linters can also automatically fix the issues that they detect in the code. `ruff` modifies the file in place, so we will first create a copy (so that our original remains intact) and then run the formatter on that copy:
❯ cp src/BetterCodeBetterScience/formatting_example.py src/BetterCodeBetterScience/formatting_example_ruff.py
❯ ruff format src/BetterCodeBetterScience/formatting_example_ruff.py
1 file reformatted
❯ diff src/BetterCodeBetterScience/formatting_example.py src/BetterCodeBetterScience/formatting_example_ruff.py
1,3d0
<
<
<
8c5
< mynum=randint(0,100)
\ No newline at end of file
---
> mynum = randint(0, 100)
The `diff` result shows that `ruff` reformatted the code on line 8 (to add spaces in compliance with PEP8) and also removed some empty lines in the file. It did not, however, change the import statement; that's a level of modification that is beyond the power of a static analysis tool.
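A formatter is meant to change only the layout of the code, not what it does. One way to convince yourself of this, sketched here with the standard `ast` module, is to check that the original and reformatted lines parse to the same syntax tree. This is an illustration rather than a description of any check that `ruff` itself performs, and `same_ast` is a hypothetical helper name.

```python
import ast


def same_ast(source_a: str, source_b: str) -> bool:
    """True if both sources parse to the same syntax tree, ignoring layout."""
    # ast.dump() omits line and column positions by default,
    # so differences in whitespace do not affect the comparison
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


original = "mynum=randint(0,100)"
formatted = "mynum = randint(0, 100)\n"

print(same_ast(original, formatted))  # True: only whitespace differs
print(same_ast(original, "mynum = randint(0, 101)\n"))  # False: an argument changed
```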
Formatting code using AI agents
Unsurprisingly, AI coding agents are also quite good at fixing formatting and styling issues in code. Here is a simple example where we prompt the GitHub Copilot chat within VSCode to fix the formatting in the example code from above:
Figure 2. The GitHub Copilot chat within VSCode was used to prompt the model (Claude Sonnet 4) to fix issues with the code. The model generated new code and also outlined its improvements.
The agent addressed the formatting issues and also fixed the wildcard import, improved the variable naming, and updated the comment. Here is the new code generated by the model:
# Well-formatted code following PEP8 and best practices
import numpy.random as np_random
my_num = np_random.randint(0, 100)
With a more detailed prompt, the model could certainly be asked to make more extensive changes to more complex code (such as improving variable names).