Fix more bugs during code review

Python and Django improvements suggested right inside your Bitbucket pull request.

1
+
values = (
2
+
    'yes',
3
+
    'correct'

Missing commas in tuples results in implicit string concatenation. Probably not what you intended to do.

Read more
-
    'correct'
+
    'correct',
Expand 2 lines ...
4
+
    'agreed',
5
+
)

Cleaner and safer code

Code Review Doctor reviews your Bitbucket pull requests to prevent adding vulnerabilities, performance, or maintainability issues to your code. The code review runs in the cloud so there is nothing to install locally. Cutting edge technology achieves near-zero noise for developers so you can follow best practices without changing how you work.

Works with tools you use

Read about how it works.

It caught malformed test assertions that were incorrect. The automated codebase audit tool is so handy!

Jared Lockhart, Senior Software Engineer at Mozilla

Extremely positive. Suggested useful changes, giving our senior developers time back.

Jon Atkinson, Technical Director at Giant

Uncovered some bugs in the tests - thanks a lot!

Patrick von Platen, Research Engineer at Hugging Face

A quick demo

This is a sample of the kind of advice you can expect from Code Review Doctor during code review.

code-review-doctorbotwould suggest these changes right inside your pull requests on GitHub.
config/settings.py
21
21
def process_environment_variable(value):
22
22
    if value > 10:
23
23
        raise ValidationError('{value} is not allowed')

An f-string will not work if the f prefix is missing.

Read more
-
        raise ValidationError('{value} is not allowed')
+
        raise ValidationError(f'{value} is not allowed')
41
41
42
42
# https://docs.djangoproject.com/en/dev/ref/settings/#append-slash
43
43
APPEND_SLASH = True

Stating defaults add complexity when reading the code but does not change Django's behaviour.

Read more
-
APPEND_SLASH = True
44
44
45
45
# Application definition
62
62
    'whitenoise.middleware.WhiteNoiseMiddleware',

The order of middleware affections the outcome. Some middleware are dependant on the functionality of other middleware. For example a middleware that requires usage of request.session should come after the SessionMiddleware.

Read more

Your website is vulnerable to a number of common hacker attacks because MIDDLEWARE setting is missing django.middleware.security.SecurityMiddleware.

Read more
+
    'django.middleware.security.SecurityMiddleware',
Expand 12 lines ...
63
63
    'django.contrib.sessions.middleware.SessionMiddleware',
64
64
    'django.middleware.cache.UpdateCacheMiddleware',
65
65
    'django.middleware.common.CommonMiddleware',
66
66
    'django.middleware.cache.FetchFromCacheMiddleware',
67
67
    'django.middleware.csrf.CsrfViewMiddleware',
68
68
    'django.contrib.auth.middleware.AuthenticationMiddleware',
69
69
    'django.contrib.messages.middleware.MessageMiddleware',
70
70
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
71
71
]
72
72
73
73
ROOT_URLCONF = 'config.urls'
74
74
75
75
TEMPLATES = [

DIRS must be absolute paths. Relative paths will not work.

Read more
76
76
    {
77
77
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
78
78
        'DIRS': ['core/templates'],
79
79
        'APP_DIRS': True,
80
80
        'OPTIONS': {
81
81
            'context_processors': [
82
82
                'django.template.context_processors.debug',
83
83
                'django.template.context_processors.request',
84
84
                'django.template.context_processors.i18n',
85
85
                'django.contrib.auth.context_processors.auth',
86
86
                'django.contrib.messages.context_processors.messages',
87
87
                'staticpages.context_processors.get_pages',
88
88
                'staticpages.context_processors.get_categories',
89
89
                'staticpages.context_processors.get_urls',
90
90
            ],
91
91
        },
92
92
    },
93
93
]
94
94
95
95
# Static files served with Whitenoise and AWS Cloudfront
119
119
SECURE_CONTENT_TYPE_NOSNIFF = True
120
120
121
121
USE_I18N = True

Stating defaults add complexity when reading the code but does not change Django's behaviour.

Read more
-
USE_I18N = True
122
122
123
123
DIRECTORY_CONSTANTS_URL_GREAT_DOMESTIC = env.str(
148
148
SSO_PROFILE_URL = 'https://profile.com'

Your website is vulnerable because the CSRF_COOKIE_SECURE setting is not set - so hackers have an easier time stealing your CSRF cookies on HTTP connections, allowing them to circumvent your CSRF protection.

Read more

Your website is vulnerable because the SESSION_COOKIE_SECURE setting is not set - so hackers have an easier time stealing your users' session cookies on HTTP connections.

Read more
+
+
+
SESSION_COOKIE_SECURE = os.getenv('SESSION_COOKIE_SECURE_ENABLED') != 'False'
+
+
CSRF_COOKIE_SECURE = os.getenv('CSRF_COOKIE_SECURE_ENABLED') != 'False'
apps/aarvark/tests.py
25
25
        allocate(membership)
26
26
        assert membership.condition is not None
27
27
        assert membership.observation_set.all().count() is 0

Primitive data types such as strings and integers should be compared using == and != rather than is and is not.

Read more
-
        assert membership.observation_set.all().count() is 0
+
        assert membership.observation_set.all().count() == 0
Expand 2 lines ...
28
28
29
29
        membership.add_observations()
30
30
        assert membership.observation_set.all().count() is 1

Primitive data types such as strings and integers should be compared using == and != rather than is and is not.

Read more
-
        assert membership.observation_set.all().count() is 1
+
        assert membership.observation_set.all().count() == 1
Expand 2 lines ...
31
31
32
32
    def test_title_content(self):
34
34
        expected_object_name = f'{topic.title}'
35
35
        count = Topic.objects.count()
36
36
        self.assertEquals(expected_object_name, 'Test topic')

Don't use deprecated TestCase assertions like self.assertEquals. Use their modern counterpart instead like self.assertEqual.

Read more
-
        self.assertEquals(expected_object_name, 'Test topic')
+
        self.assertEqual(expected_object_name, 'Test topic')
37
37
        self.assertEquals(count,1)

Don't use deprecated TestCase assertions like self.assertEquals. Use their modern counterpart instead like self.assertEqual.

Read more
-
        self.assertEquals(count,1)
+
        self.assertEqual(count,1)
Expand 2 lines ...
38
38
39
39
    def test_one_object(self):
40
40
        self.assertTrue(len(DataCollector().profiles), 1)

assertTrue is not for comparing arguments. Consider assertEqual or others instead.

Read more
-
        self.assertTrue(len(DataCollector().profiles), 1)
+
        self.assertEqual(len(DataCollector().profiles), 1)
Expand 2 lines ...
41
41
42
42
    def test_pivot_dtaccessor(self):
67
67
            }
68
68
        )
69
69
        df["dt1"] = df["dt1"].apply(lambda d: pd.Timestamp(d))

Avoid unnecessarily wrapping a function in a lambda

Read more
-
        df["dt1"] = df["dt1"].apply(lambda d: pd.Timestamp(d))
+
        df["dt1"] = df["dt1"].apply(pd.Timestamp)
70
70
        df["dt2"] = df["dt2"].apply(lambda d: pd.Timestamp(d))

Avoid unnecessarily wrapping a function in a lambda

Read more
-
        df["dt2"] = df["dt2"].apply(lambda d: pd.Timestamp(d))
+
        df["dt2"] = df["dt2"].apply(pd.Timestamp)
Expand 2 lines ...
71
71
72
72
        result = pivot_table(
83
83
        )
84
84
        self.assertTrue(serializer.data[0]["is_control"])
85
85
        self.assertFalse(any([b["is_control"] for b in serializer.data[1:]]))

all and any can take a generator, so constructing a list first may be unnecessary.

Read more
-
        self.assertFalse(any([b["is_control"] for b in serializer.data[1:]]))
+
        self.assertFalse(any(b["is_control"] for b in serializer.data[1:]))
86
86
        self.assertEqual(sorted_treatment_ids, [b["id"] for b in serializer.data[1:]])
87
87
95
95
        ).render(self.mocked_context)
96
96
97
97
        self.assertTrue('{"name": "Tom Waits"}' in out)

assertIn and assertNotIn provide more helpful failure messages than assertTrue or assertFalse.

Read more
-
        self.assertTrue('{"name": "Tom Waits"}' in out)
+
        self.assertIn('{"name": "Tom Waits"}', out)
Expand 2 lines ...
98
98
99
99
    def test_can_be_hashed_including_non_ascii(self):
106
106
            user=self.user)
107
107
        hash = a.generate_hash()
108
108
        self.assertTrue(hash is not None)

assertIsNone and assertIsNotNone provide more helpful failure messages than assertTrue or assertFalse.

Read more
-
        self.assertTrue(hash is not None)
+
        self.assertIsNotNone(hash)
Expand 2 lines ...
109
109
110
110
    def test_is_valid_call_prerequisite_validators(self, mock_prerequisite_validator):
111
111
        prerequisite_string = "LOSIS1452 OU LPORT5896"
112
112
        program_tree = ProgramTreeFactory(),

A comma after a value turns it into tuple. Probably not what you intended to do.

Read more
-
        program_tree = ProgramTreeFactory(),
+
        program_tree = ProgramTreeFactory()
113
113
114
114
    def test_is_valid_call_prerequisite_validators(self, mock_prerequisite_validator):

Duplicate names for tests results in some tests being skipped.

Read more
-
    def test_is_valid_call_prerequisite_validators(self, mock_prerequisite_validator):
+
    def test_is_valid_call_prerequisite_validators_two(self, mock_prerequisite_validator):
Expand 2 lines ...
apps/aarvark/mixins.py
3
3
# ProbablyMeantTuple
4
4
5
5
class LoginRequiredMixin(object):

A class is inherited from object by default, so explicitly inheriting from object is redundant. Removing it keeps the code simpler.

Read more
-
class LoginRequiredMixin(object):
+
class LoginRequiredMixin:
Expand 2 lines ...
6
6
7
7
    redirect_field_name = 'next'
8
8
    login_url = None
9
9
    fields = ('title')

A tuple with only one element must end with a comma. Python won't know it's a tuple without the comma. Probably not what you intended to do.

Read more
-
    fields = ('title')
+
    fields = ('title',)
Expand 2 lines ...
10
10
11
11
    def format_product_option(self, product_string):
13
13
        try:
14
14
            list(map(int, products))
15
15
        except ValueError or SyntaxError:

Catch multiple exception types using a tuple, not with or operator.

Read more
-
        except ValueError or SyntaxError:
+
        except (ValueError, SyntaxError):
Expand 2 lines ...
apps/aarvark/cache.py
8
8
9
9
10
10
@dataclass

Use frozen=True to make the dataclasses immutable and hashable.

Read more
-
@dataclass
+
@dataclass(frozen=True)
Expand 2 lines ...
11
11
class Cache:
12
12
    name: str
14
14
15
15
16
16
class ArgInfo(NamedTuple):

Consider using the dataclass here instead for simplicity, performance gains, and consistency

Read more
-
class ArgInfo(NamedTuple):
+
@dataclass(frozen=True)
+
class ArgInfo:
    args: List[Value]
17
17
    args: List[Value]
18
18
    arg_names: List[Optional[str]]
23
23
    global _caches
24
24
    caches = _caches
25
25
    if caches == None:

Singleton data types such as booleans should be compared using is and is not rather than == and !=.

Read more
-
    if caches == None:
+
    if caches is None:
Expand 2 lines ...
26
26
        pidCache = dict((u.pid, u) for u in _databaseQuery().all())

Using list and dict comprehension is simpler and computationally quicker than calling list() and dict().

Read more
-
        pidCache = dict((u.pid, u) for u in _databaseQuery().all())
+
        pidCache = {u.pid: u for u in _databaseQuery().all()}
Expand 2 lines ...
27
27
        usernameCache = dict((u.username, u) for u in pidCache.values())

Using list and dict comprehension is simpler and computationally quicker than calling list() and dict().

Read more
-
        usernameCache = dict((u.username, u) for u in pidCache.values())
+
        usernameCache = {u.username: u for u in pidCache.values()}
Expand 2 lines ...
28
28
        idCache = dict((u.id, u) for u in pidCache.values())

Using list and dict comprehension is simpler and computationally quicker than calling list() and dict().

Read more
-
        idCache = dict((u.id, u) for u in pidCache.values())
+
        idCache = {u.id: u for u in pidCache.values()}
Expand 2 lines ...
29
29
        caches = (pidCache, usernameCache, idCache)
30
30
        _caches = caches
36
36
    def open(self, parent=None, detail_item=None, external=False):
37
37
        url = 'https://books.google.com/books'
38
38
        if True or external or self.config.get('open_external', False):

This code will always evaluate the same value regardless of the inputs.

Read more
46
46
    def __str__(self):
47
47
        course = self.course
48
48
        return "{} - {} - {}".format(self._type, course.id, course.subject.name)

f-string is easier to read, write, and less computationally expensive than legacy string formatting.

Read more
55
55
        with open('rack_elevation.css', 'w') as css_file:

Read and write calls will fail at runtime if open is called with incorrect modes.

Read more

Not specifying encoding when reading a file can cause UnicodeDecodeError because Python assumes the file is encoded with the OS's default text encoding, but that's often an invalid assumption.

Read more
-
        with open('rack_elevation.css', 'w') as css_file:
+
        with open('rack_elevation.css') as css_file:
apps/aarvark/ref_proxy.py
12
12
    _invoke_func = _local_invoke
13
13
    # Bypass ScriptModules when checking for async function attribute.
14
14
    bypass_type = issubclass(rref_type, torch.jit.ScriptModule) or issubclass(

issubclass can take multiple types, so there is no need to call them multiple times for each type.

Read more
-
    bypass_type = issubclass(rref_type, torch.jit.ScriptModule) or issubclass(
+
    bypass_type = issubclass(rref_type, (torch.jit.ScriptModule, torch._C.ScriptModule))
-
        rref_type, torch._C.ScriptModule
-
    )
    import ipdb; ipdb.set_trace()
15
15
        rref_type, torch._C.ScriptModule
16
16
    )
17
17
    import ipdb; ipdb.set_trace()

Don't commit breakpoints. This will pause execution in prod and break the app.

Read more
-
    import ipdb; ipdb.set_trace()
18
18
19
19
23
23
    conf = {}
24
24
    with open(self._conf_path, "r", encoding='utf-8') as conf_src:
25
25
        conf = json.loads(conf_src.read())

json.load(f) simplifies reading JSON from disk.

Read more
-
        conf = json.loads(conf_src.read())
+
        conf = json.load(conf_src)
26
26
    return conf
27
27
31
31
        f.write(json.dumps(result))

json.dump(f) simplifies writing JSON to disk.

Read more
-
        f.write(json.dumps(result))
+
        json.dump(result, f)
apps/aarvark/api.py
16
16
        user = PlatformUser.objects.get(pk=1)
17
17
18
18
        if user.group.id != get_request_user_organisation_id(request):

When working with foreign keys, accessing the related field will result in a database read. That can be eliminated by using *_id, which is the foreign key value that Django has already cached on the object to make this scenario more efficient.

Read more
-
        if user.group.id != get_request_user_organisation_id(request):
+
        if user.group_id != get_request_user_organisation_id(request):
Expand 3 lines ...
19
19
            raise PermissionDenied()
20
20
21
21
        return Response()
25
25
26
26
    def post(self, request, pk):
27
27
        if PlatformUser.objects.filter(group=1).count() > 0:

Comparing queryset.count() is less efficient than checking queryset.exists(), so use querySet.count() if you only want the count, and use queryset.exists() if you only want to find out if at least one result exists.

Read more
-
        if PlatformUser.objects.filter(group=1).count() > 0:
+
        if PlatformUser.objects.filter(group=1).exists():
Expand 2 lines ...
28
28
            raise PermissionDenied()
29
29
        return Response()
30
30
34
34
35
35
    def post(self, request, pk):
36
36
        if PlatformUser.objects.filter(group=1):

This can load every row in the database table into memory because the queryset is evaluated. Checking if a queryset is truthy/falsey is much less efficient than checking queryset.exists().

Read more
-
        if PlatformUser.objects.filter(group=1):
+
        if PlatformUser.objects.filter(group=1).exists():
Expand 2 lines ...
37
37
            raise PermissionDenied()
38
38
        return Response()
39
39
45
45
        user = PlatformUser.objects.all().order_by('?')[0]

Using order_by('?') can be very inefficient if you have lots of rows in the table. Moving the randomness to the application layer will probably give significant a performance improvement.

Read more
-
        user = PlatformUser.objects.all().order_by('?')[0]
+
        user = PlatformUser.objects.all()[random.randint(0, PlatformUser.objects.count() - 1)]
Use our GitHub pull request bot to prevent issue like these getting into your codebase.