Fix more bugs during code review

Python and Django improvements suggested right inside your GitHub pull request.

1
+
values = (
2
+
    'yes',
3
+
    'correct'

Missing commas in tuples results in implicit string concatenation. Probably not what you intended to do.

Read more
Suggested changes
-
    'correct'
+
    'correct',
Expand 2 lines ...
Commit suggestion
4
+
    'agreed',
5
+
)

Cleaner and safer code

Code Review Doctor reviews your GitHub pull requests to prevent adding vulnerabilities, performance, or maintainability issues to your code. The code review runs in the cloud so there is nothing to install locally. Cutting edge technology achieves near-zero noise for developers so you can follow best practices without changing how you work.

Works with tools you use

Read about how it works.

It caught malformed test assertions that were incorrect. The automated codebase audit tool is so handy!

Jared Lockhart, Senior Software Engineer at Mozilla

Extremely positive. Suggested useful changes, giving our senior developers time back.

Jon Atkinson, Technical Director at Giant

Uncovered some bugs in the tests - thanks a lot!

Patrick von Platen, Research Engineer at Hugging Face

A quick demo

This is a sample of the kind of advice you can expect from Code Review Doctor during code review.

code-review-doctorbotwould suggest these changes right inside your pull requests on GitHub.
config/settings.py
21
21
def process_environment_variable(value):
22
22
    if value > 10:
23
23
        raise ValidationError('{value} is not allowed')

An f-string will not work if the f prefix is missing.

Read more
Suggested changes
-
        raise ValidationError('{value} is not allowed')
+
        raise ValidationError(f'{value} is not allowed')
41
41
42
42
# https://docs.djangoproject.com/en/dev/ref/settings/#append-slash
43
43
APPEND_SLASH = True

Stating defaults add complexity when reading the code but does not change Django's behaviour.

Read more
Suggested changes
-
APPEND_SLASH = True
44
44
45
45
# Application definition
62
62
    'whitenoise.middleware.WhiteNoiseMiddleware',

The order of middleware affections the outcome. Some middleware are dependant on the functionality of other middleware. For example a middleware that requires usage of request.session should come after the SessionMiddleware.

Read more

Your website is vulnerable to a number of common hacker attacks because MIDDLEWARE setting is missing django.middleware.security.SecurityMiddleware.

Read more
Suggested changes
+
    'django.middleware.security.SecurityMiddleware',
Expand 12 lines ...
63
63
    'django.contrib.sessions.middleware.SessionMiddleware',
64
64
    'django.middleware.cache.UpdateCacheMiddleware',
65
65
    'django.middleware.common.CommonMiddleware',
66
66
    'django.middleware.cache.FetchFromCacheMiddleware',
67
67
    'django.middleware.csrf.CsrfViewMiddleware',
68
68
    'django.contrib.auth.middleware.AuthenticationMiddleware',
69
69
    'django.contrib.messages.middleware.MessageMiddleware',
70
70
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
71
71
]
72
72
73
73
ROOT_URLCONF = 'config.urls'
74
74
75
75
TEMPLATES = [

DIRS must be absolute paths. Relative paths will not work.

Read more
76
76
    {
77
77
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
78
78
        'DIRS': ['core/templates'],
79
79
        'APP_DIRS': True,
80
80
        'OPTIONS': {
81
81
            'context_processors': [
82
82
                'django.template.context_processors.debug',
83
83
                'django.template.context_processors.request',
84
84
                'django.template.context_processors.i18n',
85
85
                'django.contrib.auth.context_processors.auth',
86
86
                'django.contrib.messages.context_processors.messages',
87
87
                'staticpages.context_processors.get_pages',
88
88
                'staticpages.context_processors.get_categories',
89
89
                'staticpages.context_processors.get_urls',
90
90
            ],
91
91
        },
92
92
    },
93
93
]
94
94
95
95
# Static files served with Whitenoise and AWS Cloudfront
119
119
SECURE_CONTENT_TYPE_NOSNIFF = True
120
120
121
121
USE_I18N = True

Stating defaults add complexity when reading the code but does not change Django's behaviour.

Read more
Suggested changes
-
USE_I18N = True
122
122
123
123
DIRECTORY_CONSTANTS_URL_GREAT_DOMESTIC = env.str(
148
148
SSO_PROFILE_URL = 'https://profile.com'

Your website is vulnerable because the CSRF_COOKIE_SECURE setting is not set - so hackers have an easier time stealing your CSRF cookies on HTTP connections, allowing them to circumvent your CSRF protection.

Read more

Your website is vulnerable because the SESSION_COOKIE_SECURE setting is not set - so hackers have an easier time stealing your users' session cookies on HTTP connections.

Read more
Suggested changes
+
+
+
SESSION_COOKIE_SECURE = os.getenv('SESSION_COOKIE_SECURE_ENABLED') != 'False'
+
+
CSRF_COOKIE_SECURE = os.getenv('CSRF_COOKIE_SECURE_ENABLED') != 'False'
apps/aarvark/tests.py
25
25
        allocate(membership)
26
26
        assert membership.condition is not None
27
27
        assert membership.observation_set.all().count() is 0

Primitive data types such as strings and integers should be compared using == and != rather than is and is not.

Read more
Suggested changes
-
        assert membership.observation_set.all().count() is 0
+
        assert membership.observation_set.all().count() == 0
Expand 2 lines ...
28
28
29
29
        membership.add_observations()
30
30
        assert membership.observation_set.all().count() is 1

Primitive data types such as strings and integers should be compared using == and != rather than is and is not.

Read more
Suggested changes
-
        assert membership.observation_set.all().count() is 1
+
        assert membership.observation_set.all().count() == 1
Expand 2 lines ...
31
31
32
32
    def test_title_content(self):
34
34
        expected_object_name = f'{topic.title}'
35
35
        count = Topic.objects.count()
36
36
        self.assertEquals(expected_object_name, 'Test topic')

Don't use deprecated TestCase assertions like self.assertEquals. Use their modern counterpart instead like self.assertEqual.

Read more
Suggested changes
-
        self.assertEquals(expected_object_name, 'Test topic')
+
        self.assertEqual(expected_object_name, 'Test topic')
37
37
        self.assertEquals(count,1)

Don't use deprecated TestCase assertions like self.assertEquals. Use their modern counterpart instead like self.assertEqual.

Read more
Suggested changes
-
        self.assertEquals(count,1)
+
        self.assertEqual(count,1)
Expand 2 lines ...
38
38
39
39
    def test_one_object(self):
40
40
        self.assertTrue(len(DataCollector().profiles), 1)

assertTrue is not for comparing arguments. Consider assertEqual or others instead.

Read more
Suggested changes
-
        self.assertTrue(len(DataCollector().profiles), 1)
+
        self.assertEqual(len(DataCollector().profiles), 1)
Expand 2 lines ...
41
41
42
42
    def test_pivot_dtaccessor(self):
67
67
            }
68
68
        )
69
69
        df["dt1"] = df["dt1"].apply(lambda d: pd.Timestamp(d))

Avoid unnecessarily wrapping a function in a lambda

Read more
Suggested changes
-
        df["dt1"] = df["dt1"].apply(lambda d: pd.Timestamp(d))
+
        df["dt1"] = df["dt1"].apply(pd.Timestamp)
70
70
        df["dt2"] = df["dt2"].apply(lambda d: pd.Timestamp(d))

Avoid unnecessarily wrapping a function in a lambda

Read more
Suggested changes
-
        df["dt2"] = df["dt2"].apply(lambda d: pd.Timestamp(d))
+
        df["dt2"] = df["dt2"].apply(pd.Timestamp)
Expand 2 lines ...
71
71
72
72
        result = pivot_table(
83
83
        )
84
84
        self.assertTrue(serializer.data[0]["is_control"])
85
85
        self.assertFalse(any([b["is_control"] for b in serializer.data[1:]]))

all and any can take a generator, so constructing a list first may be unnecessary.

Read more
Suggested changes
-
        self.assertFalse(any([b["is_control"] for b in serializer.data[1:]]))
+
        self.assertFalse(any(b["is_control"] for b in serializer.data[1:]))
86
86
        self.assertEqual(sorted_treatment_ids, [b["id"] for b in serializer.data[1:]])
87
87
95
95
        ).render(self.mocked_context)
96
96
97
97
        self.assertTrue('{"name": "Tom Waits"}' in out)

assertIn and assertNotIn provide more helpful failure messages than assertTrue or assertFalse.

Read more
Suggested changes
-
        self.assertTrue('{"name": "Tom Waits"}' in out)
+
        self.assertIn('{"name": "Tom Waits"}', out)
Expand 2 lines ...
98
98
99
99
    def test_can_be_hashed_including_non_ascii(self):
106
106
            user=self.user)
107
107
        hash = a.generate_hash()
108
108
        self.assertTrue(hash is not None)

assertIsNone and assertIsNotNone provide more helpful failure messages than assertTrue or assertFalse.

Read more
Suggested changes
-
        self.assertTrue(hash is not None)
+
        self.assertIsNotNone(hash)
Expand 2 lines ...
109
109
110
110
    def test_is_valid_call_prerequisite_validators(self, mock_prerequisite_validator):
111
111
        prerequisite_string = "LOSIS1452 OU LPORT5896"
112
112
        program_tree = ProgramTreeFactory(),

A comma after a value turns it into tuple. Probably not what you intended to do.

Read more
Suggested changes
-
        program_tree = ProgramTreeFactory(),
+
        program_tree = ProgramTreeFactory()
113
113
114
114
    def test_is_valid_call_prerequisite_validators(self, mock_prerequisite_validator):

Duplicate names for tests results in some tests being skipped.

Read more
Suggested changes
-
    def test_is_valid_call_prerequisite_validators(self, mock_prerequisite_validator):
+
    def test_is_valid_call_prerequisite_validators_two(self, mock_prerequisite_validator):
Expand 2 lines ...
apps/aarvark/mixins.py
3
3
# ProbablyMeantTuple
4
4
5
5
class LoginRequiredMixin(object):

A class is inherited from object by default, so explicitly inheriting from object is redundant. Removing it keeps the code simpler.

Read more
Suggested changes
-
class LoginRequiredMixin(object):
+
class LoginRequiredMixin:
Expand 2 lines ...
6
6
7
7
    redirect_field_name = 'next'
8
8
    login_url = None
9
9
    fields = ('title')

A tuple with only one element must end with a comma. Python won't know it's a tuple without the comma. Probably not what you intended to do.

Read more
Suggested changes
-
    fields = ('title')
+
    fields = ('title',)
Expand 2 lines ...
10
10
11
11
    def format_product_option(self, product_string):
13
13
        try:
14
14
            list(map(int, products))
15
15
        except ValueError or SyntaxError:

Catch multiple exception types using a tuple, not with or operator.

Read more
Suggested changes
-
        except ValueError or SyntaxError:
+
        except (ValueError, SyntaxError):
Expand 2 lines ...
apps/aarvark/cache.py
8
8
9
9
10
10
@dataclass

Use frozen=True to make the dataclasses immutable and hashable.

Read more
Suggested changes
-
@dataclass
+
@dataclass(frozen=True)
Expand 2 lines ...
11
11
class Cache:
12
12
    name: str
14
14
15
15
16
16
class ArgInfo(NamedTuple):

Consider using the dataclass here instead for simplicity, performance gains, and consistency

Read more
Suggested changes
-
class ArgInfo(NamedTuple):
+
@dataclass(frozen=True)
+
class ArgInfo:
    args: List[Value]
17
17
    args: List[Value]
18
18
    arg_names: List[Optional[str]]
23
23
    global _caches
24
24
    caches = _caches
25
25
    if caches == None:

Singleton data types such as booleans should be compared using is and is not rather than == and !=.

Read more
Suggested changes
-
    if caches == None:
+
    if caches is None:
Expand 2 lines ...
26
26
        pidCache = dict((u.pid, u) for u in _databaseQuery().all())

Using list and dict comprehension is simpler and computationally quicker than calling list() and dict().

Read more
Suggested changes
-
        pidCache = dict((u.pid, u) for u in _databaseQuery().all())
+
        pidCache = {u.pid: u for u in _databaseQuery().all()}
Expand 2 lines ...
27
27
        usernameCache = dict((u.username, u) for u in pidCache.values())

Using list and dict comprehension is simpler and computationally quicker than calling list() and dict().

Read more
Suggested changes
-
        usernameCache = dict((u.username, u) for u in pidCache.values())
+
        usernameCache = {u.username: u for u in pidCache.values()}
Expand 2 lines ...
28
28
        idCache = dict((u.id, u) for u in pidCache.values())

Using list and dict comprehension is simpler and computationally quicker than calling list() and dict().

Read more
Suggested changes
-
        idCache = dict((u.id, u) for u in pidCache.values())
+
        idCache = {u.id: u for u in pidCache.values()}
Expand 2 lines ...
29
29
        caches = (pidCache, usernameCache, idCache)
30
30
        _caches = caches
36
36
    def open(self, parent=None, detail_item=None, external=False):
37
37
        url = 'https://books.google.com/books'
38
38
        if True or external or self.config.get('open_external', False):

This code will always evaluate the same value regardless of the inputs.

Read more
46
46
    def __str__(self):
47
47
        course = self.course
48
48
        return "{} - {} - {}".format(self._type, course.id, course.subject.name)

f-string is easier to read, write, and less computationally expensive than legacy string formatting.

Read more
55
55
        with open('rack_elevation.css', 'w') as css_file:

Read and write calls will fail at runtime if open is called with incorrect modes.

Read more

Not specifying encoding when reading a file can cause UnicodeDecodeError because Python assumes the file is encoded with the OS's default text encoding, but that's often an invalid assumption.

Read more
Suggested changes
-
        with open('rack_elevation.css', 'w') as css_file:
+
        with open('rack_elevation.css') as css_file:
apps/aarvark/ref_proxy.py
12
12
    _invoke_func = _local_invoke
13
13
    # Bypass ScriptModules when checking for async function attribute.
14
14
    bypass_type = issubclass(rref_type, torch.jit.ScriptModule) or issubclass(

issubclass can take multiple types, so there is no need to call them multiple times for each type.

Read more
Suggested changes
-
    bypass_type = issubclass(rref_type, torch.jit.ScriptModule) or issubclass(
+
    bypass_type = issubclass(rref_type, (torch.jit.ScriptModule, torch._C.ScriptModule))
-
        rref_type, torch._C.ScriptModule
-
    )
    import ipdb; ipdb.set_trace()
15
15
        rref_type, torch._C.ScriptModule
16
16
    )
17
17
    import ipdb; ipdb.set_trace()

Don't commit breakpoints. This will pause execution in prod and break the app.

Read more
Suggested changes
-
    import ipdb; ipdb.set_trace()
18
18
19
19
23
23
    conf = {}
24
24
    with open(self._conf_path, "r", encoding='utf-8') as conf_src:
25
25
        conf = json.loads(conf_src.read())

json.load(f) simplifies reading JSON from disk.

Read more
Suggested changes
-
        conf = json.loads(conf_src.read())
+
        conf = json.load(conf_src)
26
26
    return conf
27
27
31
31
        f.write(json.dumps(result))

json.dump(f) simplifies writing JSON to disk.

Read more
Suggested changes
-
        f.write(json.dumps(result))
+
        json.dump(result, f)
apps/aarvark/api.py
16
16
        user = PlatformUser.objects.get(pk=1)
17
17
18
18
        if user.group.id != get_request_user_organisation_id(request):

When working with foreign keys, accessing the related field will result in a database read. That can be eliminated by using *_id, which is the foreign key value that Django has already cached on the object to make this scenario more efficient.

Read more
Suggested changes
-
        if user.group.id != get_request_user_organisation_id(request):
+
        if user.group_id != get_request_user_organisation_id(request):
Expand 3 lines ...
19
19
            raise PermissionDenied()
20
20
21
21
        return Response()
25
25
26
26
    def post(self, request, pk):
27
27
        if PlatformUser.objects.filter(group=1).count() > 0:

Comparing queryset.count() is less efficient than checking queryset.exists(), so use querySet.count() if you only want the count, and use queryset.exists() if you only want to find out if at least one result exists.

Read more
Suggested changes
-
        if PlatformUser.objects.filter(group=1).count() > 0:
+
        if PlatformUser.objects.filter(group=1).exists():
Expand 2 lines ...
28
28
            raise PermissionDenied()
29
29
        return Response()
30
30
34
34
35
35
    def post(self, request, pk):
36
36
        if PlatformUser.objects.filter(group=1):

This can load every row in the database table into memory because the queryset is evaluated. Checking if a queryset is truthy/falsey is much less efficient than checking queryset.exists().

Read more
Suggested changes
-
        if PlatformUser.objects.filter(group=1):
+
        if PlatformUser.objects.filter(group=1).exists():
Expand 2 lines ...
37
37
            raise PermissionDenied()
38
38
        return Response()
39
39
45
45
        user = PlatformUser.objects.all().order_by('?')[0]

Using order_by('?') can be very inefficient if you have lots of rows in the table. Moving the randomness to the application layer will probably give significant a performance improvement.

Read more
Suggested changes
-
        user = PlatformUser.objects.all().order_by('?')[0]
+
        user = PlatformUser.objects.all()[random.randint(0, PlatformUser.objects.count() - 1)]
Use our GitHub pull request bot to prevent issue like these getting into your codebase.