12 years of Pylint
or
How I Learned to Stop Worrying and Love the bugs
Claudiu Popa
static analysis: analysing computer software without executing the program
you can benefit from using static analysis if:
- running tests takes a lot of time or work
- you don't have tests for a legacy system
- you need a form of automatic reviews
not equivalent to a review
     1  import os
     2
     3  def process_stuff(params):
     4      executed = False
     5      if not params:
     6          raise ValueError('empty command list')
     7          # I didn't intend to put this here.
     8          for foo in params:
     9              # Oops, forgot to call it
    10              foo.execute
    $ pylint a.py
    W: 8: Unreachable code
    W: 10: Statement seems to have no effect
    W: 4: Unused variable 'executed'
    W: 1: Unused import os
using undefined variables
accessing undefined members
calling objects which aren't callable
    $ cat a.py
    import zipfile
    f = zipfile.ZipFile(outfile, 'w', zipfile.DEFLATED)
    f()
    $ pylint a.py
    E: 2,20: Undefined variable 'outfile' (undefined-variable)
    E: 2,42: Module 'zipfile' has no 'DEFLATED' member (no-member)
    E: 3, 1: f is not callable (not-callable)
or special methods implemented incorrectly
    $ cat a.py
    class MyContextManager(object):
        def __enter__(self):
            pass
        # It needs three arguments
        def __exit__(self):
            pass
    $ pylint a.py
    E: The special method '__exit__' expects 3 params, 0 was given
constant if conditions
    $ cat a.py
    def func():
        return bool(some_condition)

    # func is always true
    if func:
        pass
    $ pylint a.py
    W: 5: Using a conditional statement with a constant value
try to figure out what's the problem in this code.
should print 0, 1, 2, 3, ..., 9, right?
    def bad_case2():
        return [(lambda: var) for var in range(10)]

    for callable in bad_case2():
        print(callable())
actually no:
    $ python a.py
    9
    9
    9
    ...
    $ pylint a.py
    W: 2,20: Cell variable 'var' defined in loop
each lambda in the previous code is a closure: var is looked up in the enclosing scope when the lambda is called, not when it is created.
by the time the lambdas are called the loop has finished, and var in the enclosing scope is 9.
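a common way out of this trap is to bind the loop variable when the lambda is created, for instance through a default argument. A minimal sketch; the names late_binding and early_binding are made up for this illustration:

```python
def late_binding():
    # Each lambda looks up `var` only when it is called,
    # i.e. after the comprehension has finished.
    return [(lambda: var) for var in range(10)]


def early_binding():
    # A default argument is evaluated when the lambda is created,
    # so each lambda keeps its own value of `var`.
    return [(lambda var=var: var) for var in range(10)]


late_results = [func() for func in late_binding()]
early_results = [func() for func in early_binding()]
print(late_results)
print(early_results)
```

the first list contains the last loop value ten times, the second one contains 0 through 9.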
one of the oldest (maintained) static analysis tools
created by Logilab (Sylvain Thenault) in 2003
Google uses its own version internally: gpylint
over 35000 lines of code + tests, according to ohloh.net
- pylint: 2416 commits, 21536 lines of code
- astroid: 1604 commits, 14045 lines of code
GPL licensed :-(
there's a split between the verifications (pylint) and the component that understands Python (astroid)
follows the general pattern of building a linter: uses ASTs
ASTs - abstract syntax trees - tree representations of the syntactic structure of source code
uses the ast module internally
    from ast import parse, dump
    module = parse('''
    def test(a, b, *, foo=None):
        pass
    ''')
    print(dump(module))
the ast module is great, but it is not backwards compatible between Python versions
astroid strives to be a compatibility layer between the various versions of ast
its API is similar to that of the ast module
    from astroid import parse
    module = parse('''
    def test(a, b, *, foo=None):
        pass
    ''')
    print(module.repr_tree())
astroid nodes provide useful capabilities
you can get a node's parent:
    >>> from astroid import extract_node
    >>> node = extract_node('''
    def func():
        f = 42 #@
    ''')
    >>> node
    <Assign() l.2 [] at 0x2c49dd0>
    >>> node.parent
    <Function() l.2 [] at 0x2c49d80>
    >>> node.parent.parent
    <Module() l.0 [] at 0x2c49d90>
you can get the children of a node
    >>> node = extract_node('''
    def test():
        europython = 1
        foo = 42
    ''')
    >>> list(node.get_children())
    [<Arguments() l.2 [] at 0x2bb2114208>,
     <Assign() l.3 [] at 0x2bb2114278>,
     <Assign() l.4 [] at 0x2bb2114320>]
you can get a node's lexical scope
    >>> node = extract_node('a = 1')
    >>> node.scope()
    <Module() l.0 [] at 0x2c49d90>
    >>> node = extract_node('''
    def test():
        foo = 42 #@
    ''')
    >>> node.scope()
    <Function(test) l.2 [] at 0x2bfbf10>
    >>> node = extract_node("[__(foo) for foo in range(10)]")
    >>> node.scope()
    <ListComp() l.2 [] at 0x795684240>
some nodes are augmented with capabilities tailored for them
    klass = extract_node('''
    from collections import OrderedDict
    import abc
    class A(object): pass
    class B(object): pass
    class C(A, B): object

    class OmgMetaclasses(OrderedDict, C, metaclass=abc.ABCMeta):
        __slots__ = ('foo', 'bar')
        version = 1.0
    ''')
getting a class's slots
    >>> klass.slots()
    [<Const(str) l.4 [] at ...>, <Const(str) l.4 [] at ...>]
getting a class's metaclass
    >>> klass.metaclass()
    <Class(ABCMeta) l.109 [abc] at 0x9cfd5e6470>
getting a class's method resolution order
    >>> klass.mro()
    [<Class(OmgMetaclasses) l.8 [] at ...>,
     <Class(OrderedDict) l.43 [collections] at ...>,
     <Class(dict) l.0 [builtins] at ...>,
     <Class(C) l.6 [] at ...>,
     <Class(A) l.4 [] at ...>,
     <Class(B) l.5 [] at ...>,
     <Class(object) l.0 [builtins] at ...>]
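the statically computed order can be compared with what the interpreter reports at runtime. A small sketch that defines the same hierarchy for real (with __slots__ left out to keep it short) and reads __mro__; this is only a cross-check, not how astroid computes the order:

```python
import abc
from collections import OrderedDict


class A(object):
    pass


class B(object):
    pass


class C(A, B):
    pass


class OmgMetaclasses(OrderedDict, C, metaclass=abc.ABCMeta):
    version = 1.0


# The interpreter applies the C3 linearization when the class is created.
mro_names = [cls.__name__ for cls in OmgMetaclasses.__mro__]
print(mro_names)
print(type(OmgMetaclasses).__name__)
```

the names come out in the same order as in the klass.mro() result above, and the metaclass is ABCMeta.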
    >>> n = extract_node('''
    def func(arg):
        return arg + arg

    func(24)
    ''')
    >>> n
    <CallFunc() l.5 [] at 0x6360d01b00>
    >>> inferred = next(n.infer())
    >>> inferred
    <Const(int) l.None [int] at 0x94764b1908>
    >>> inferred.value
    48
    >>> n = extract_node('''
    class A(object):
        def __init__(self):
            self.foo = 42
        def __add__(self, other):
            return other.bar + self.foo / 2
    class B(A):
        def __init__(self):
            self.bar = 24
        def __radd__(self, other):
            return NotImplemented
    A() + B()
    ''')
    >>> n
    <BinOp() l.12 [] at 0x66d4e9ce80>
    >>> inferred = next(n.infer())
    >>> inferred.value
    45.0
a transform is a function that receives a node and returns either the same node, modified, or a completely new node
transforms need to be registered with astroid's internal manager
    def transform_six_add_metaclass(node):
        ...

    MANAGER.register_transform(nodes.Class,
                               transform_six_add_metaclass,
                               looks_like_six_add_metaclass)
you can filter the nodes you want to be transformed by using a filter function
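the pattern (a node class, a transform and an optional filter function) can be sketched with the standard library's ast module. This is not astroid's implementation; register_transform, _Applier, looks_like_legacy_name and rename_legacy_function are hypothetical names used only here:

```python
import ast

# Hypothetical registry of (node class, transform, optional predicate).
_TRANSFORMS = []


def register_transform(node_class, transform, predicate=None):
    _TRANSFORMS.append((node_class, transform, predicate))


class _Applier(ast.NodeTransformer):
    def generic_visit(self, node):
        # Transform the children first, then the node itself.
        node = super().generic_visit(node)
        for node_class, transform, predicate in _TRANSFORMS:
            if isinstance(node, node_class) and (
                    predicate is None or predicate(node)):
                node = transform(node)
        return node


def looks_like_legacy_name(node):
    # Filter function: only functions whose name starts with "old_".
    return node.name.startswith("old_")


def rename_legacy_function(node):
    # Transform: returns the same node, modified.
    node.name = "new_" + node.name[len("old_"):]
    return node


register_transform(ast.FunctionDef, rename_legacy_function,
                   looks_like_legacy_name)

tree = ast.parse("def old_spam(): pass\ndef eggs(): pass\n")
tree = _Applier().visit(tree)
function_names = [node.name for node in tree.body]
print(function_names)
```

only old_spam passes the filter and gets renamed; eggs is left untouched.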
we also provide a way to add new inference rules
we already use this API for understanding builtins: super, type, isinstance, callable, list, frozenset etc
    def infer_super(node):
        # Return an iterator of results
        return iter(inference_results)

    MANAGER.register_transform(nodes.CallFunc, inference_tip(infer_super))
     1  class A(object):
     2      def spam(self): return "A"
     3      foo = 42
     4
     5  class B(A):
     6      def boo(self, a): print(a)
     7
     8  class C(A):
     9      def boo(self, a, b): print(a, b)
    10
    11  class E(C, B):
    12      def __init__(self):
    13          super(E, self).boo(4, 5)
    14          super(C, self).boo(5, 6)
    15          super(E, self).foo()
    16          super(E, self).spa
Since astroid knows how super works and understands the method resolution order, pylint can detect the errors from the previous code
    $ pylint a.py
    ...
    E: 14,26: Too many positional arguments for method call
    E: 15,26: super(E, self).foo is not callable
    E: 16,23: Super of 'E' has no 'spa' member
    import contextlib

    def real_func(): pass

    class A:
        @contextlib.contextmanager
        def meth(self):
            yield real_func

    a = [A(), 1, 2, 3][0]
    meth = hasattr(a, 'meth') and callable(a.meth) and getattr(a, 'meth')
    with meth() as foo:
        foo('EuroPython is great')

    $ pylint a.py
    ...
    E: Too many positional arguments for method call
pylint is a fancy walker over the tree provided by astroid
the verifications can be seen as patterns that are applied to certain nodes
it uses the visitor pattern to walk the tree
    class TypeChecker(BaseChecker):
        def visit_callfunc(self, node):
            ...
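the same visitor idea can be tried with nothing but the standard library. A minimal sketch built on ast.NodeVisitor rather than pylint's BaseChecker; NoEffectChecker is a made-up checker that only flags bare names and attribute accesses used as statements:

```python
import ast


class NoEffectChecker(ast.NodeVisitor):
    """Hypothetical checker for statements that do nothing."""

    def __init__(self):
        self.messages = []

    def visit_Expr(self, node):
        # An expression statement made only of a name or an attribute
        # access is evaluated and then thrown away.
        if isinstance(node.value, (ast.Name, ast.Attribute)):
            self.messages.append(
                "W: %d: Statement seems to have no effect" % node.lineno)
        self.generic_visit(node)


source = '''\
def process_stuff(params):
    for foo in params:
        foo.execute
        foo.execute()
'''

checker = NoEffectChecker()
checker.visit(ast.parse(source))
print(checker.messages)
```

the visitor dispatches on the node class name, so adding a verification means adding a visit_<NodeClass> method.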
import collections; print(collections.default)
we use inference, but that doesn't help when several lines of code modify the same object
those lines would need to be interpreted somehow. In this example, for instance, there is no way to tell whether the current instance has the attribute accessed on line 5
    1  def __init__(self, **kwargs):
    2      self.__dict__.update(kwargs)
    3
    4  def some_other_method(self):
    5      return self.some_arguments_set_in_dunder_init()
comes with a lot of goodies and it has a vibrant ecosystem
you can write your own checker, even though that implies some knowledge of Python and how pylint works
plenty of additional packages tailored for specific frameworks: pylint-flask, pylint-django, pylint-celery, pylint-fields
run your checker like this:
$ pylint --load-plugins=plugin a.py
pyreverse - generate UML diagrams for your project
spell check your comments and docstrings (needs python-enchant to be installed)
    $ pylint --spelling-dict=en_US a.py
    C: 1, 0: Wrong spelling of a word 'speling' in a docstring:
    Verify that the speling cheker work as expcted.
                    ^^^^^^^
    Did you mean: 'spieling' or 'spelling' or 'spelunking'?
Python 3 porting checker
    def download_url(url):
        ...


    map(download_url, urls) # download_url will never be called
    class A:
        __metaclass__ = type

        def __setslice__(self, other):
            if not isinstance(other, basestring):
                ...

    $ pylint a.py --py3k
    W: 5, 0: map built-in referenced when not iterating
    W: 7, 0: Assigning to a class's __metaclass__ attribute
    W: 9, 8: __setslice__ method defined
    W: 10,36: basestring built-in referenced
pyflakes: lightweight, fast, but detects only a handful of errors
promises not to have false positives or to warn about style issues
    def test():
        a, b = [1, 2, 3] # unbalanced tuple unpacking
        try:
            if None: # constant check
                pass
        except True: # catching non exception
            pass

    $ pyflakes a.py
    a.py:2: local variable 'a' is assigned to but never used
    a.py:2: local variable 'b' is assigned to but never used
pychecker: forefather of Pylint, not really static, ahead of its time, now dead
still detects issues that most static analyzers don't detect
    $ pychecker a.py
    a.py:2: Unpacking 3 values into 2 variables
    a.py:4: Using a conditional statement with a constant value
    a.py:6: Catching a non-Exception object (True)
jedi: autocompletion library, wants to be a static analyzer, a lot of hardcoded behaviour
    $ python -m jedi linter a.py
    $ # it detected nothing :(
mypy: optional type checker, with support for type hints through annotations, Guido loves it, PEP 484 started from here. Still work in progress.
    $ mypy a.py
    a.py: In function "test":
    a.py, line 2: Too many values to unpack (2 expected, 3 provided)
static analysis is great
but you can't fully understand code when:
- dynamic code is invoked
- extension modules are involved
- you don't understand flow control
- the code you're supposed to understand is too smart (namedtuple, enum, six.moves)
Some users actually expect static analysis tools to understand this kind of code
nose.trivial
    for at in [ at for at in dir(_t)
                if at.startswith('assert') and not '_' in at ]:
        pepd = pep8(at)
        vars()[pepd] = getattr(_t, at)
        __all__.append(pepd)
multiprocessing
    globals().update((name, getattr(context._default_context, name))
                     for name in context._default_context.__all__)
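why this is hard for a static tool can be shown with a small experiment: collect the names a module binds according to its AST, then execute it and look at what really ends up in its namespace. A sketch under simplifying assumptions; _Context, statically_bound_names and fake_multiprocessing are invented for this illustration, and the name collector is far cruder than what astroid does:

```python
import ast
import types

source = '''\
class _Context:
    __all__ = ['Lock', 'Queue']
    Lock = 'lock factory'
    Queue = 'queue factory'

globals().update((name, getattr(_Context, name)) for name in _Context.__all__)
'''


def statically_bound_names(code):
    """Collect module-level names bound by defs, classes, assignments, imports."""
    names = set()
    for node in ast.parse(code).body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                names.add((alias.asname or alias.name).split('.')[0])
    return names


static_names = statically_bound_names(source)

# Execute the same source in a throwaway module to see the real namespace.
module = types.ModuleType('fake_multiprocessing')
exec(source, module.__dict__)
runtime_names = {name for name in vars(module) if not name.startswith('__')}

print(sorted(static_names))
print(sorted(runtime_names))
```

Lock and Queue exist only at runtime; nothing in the tree says they are bound, so a tool has to either special-case the idiom or interpret the call.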
Thank you!