I’m used to the Java model where you can have one public class per file. Python doesn’t have this restriction, and I’m wondering what’s the best practice for organizing classes.
A Python file is called a “module” and it’s one way to organize your software so that it makes “sense”. Another is a directory, called a “package”.
A module is a distinct thing that may have one or two dozen closely-related classes. The trick is that a module is something you’ll import, and you need that import to be perfectly sensible to people who will read, maintain and extend your software.
The rule is this: a module is the unit of reuse.
You can’t easily reuse a single class. You should be able to reuse a module without any difficulties. Everything in your library (and everything you download and add) is either a module or a package of modules.
For example, you’re working on something that reads spreadsheets, does some calculations and loads the results into a database. What do you want your main program to look like?
from ssReader import Reader from theCalcs import ACalc, AnotherCalc from theDB import Loader def main( sourceFileName ): rdr= Reader( sourceFileName ) c1= ACalc( options ) c2= AnotherCalc( options ) ldr= Loader( parameters ) for myObj in rdr.readAll(): c1.thisOp( myObj ) c2.thatOp( myObj ) ldr.laod( myObj )
Think of the import as the way to organize your code in concepts or chunks. Exactly how many classes are in each import doesn’t matter. What matters is the overall organization that you’re portraying with your
Since there is no artificial limit, it really depends on what’s comprehensible. If you have a bunch of fairly short, simple classes that are logically grouped together, toss in a bunch of ’em. If you have big, complex classes or classes that don’t make sense as a group, go one file per class. Or pick something in between. Refactor as things change.
It entirely depends on how big the project is, how long the classes are, if they will be used from other files and so on.
For example I quite often use a series of classes for data-abstraction – so I may have 4 or 5 classes that may only be 1 line long (
class SomeData: pass).
It would be stupid to split each of these into separate files – but since they may be used from different files, putting all these in a separate
data_model.py file would make sense, so I can do
from mypackage.data_model import SomeData, SomeSubData
If you have a class with lots of code in it, maybe with some functions only it uses, it would be a good idea to split this class and the helper-functions into a separate file.
You should structure them so you do
from mypackage.database.schema import MyModel, not
from mypackage.email.errors import MyDatabaseModel – if where you are importing things from make sense, and the files aren’t tens of thousands of lines long, you have organised it correctly.
The Python Modules documentation has some useful information on organising packages.
I happen to like the Java model for the following reason. Placing each class in an individual file promotes reuse by making classes easier to see when browsing the source code. If you have a bunch of classes grouped into a single file, it may not be obvious to other developers that there are classes there that can be reused simply by browsing the project’s directory structure. Thus, if you think that your class can possibly be reused, I would put it in its own file.
I find myself splitting things up when I get annoyed with the bigness of files and when the desirable structure of relatedness starts to emerge naturally. Often these two stages seem to coincide.
It can be very annoying if you split things up too early, because you start to realise that a totally different ordering of structure is required.
On the other hand, when any .java or .py file is getting to more than about 700 lines I start to get annoyed constantly trying to remember where “that particular bit” is.
With Python/Jython circular dependency of import statements also seems to play a role: if you try to split too many cooperating basic building blocks into separate files this “restriction”/”imperfection” of the language seems to force you to group things, perhaps in rather a sensible way.
As to splitting into packages, I don’t really know, but I’d say probably the same rule of annoyance and emergence of happy structure works at all levels of modularity.
I would say to put as many classes as can be logically grouped in that file without making it too big and complex.