Fix more bugs during code review

Python and Django improvements suggested right inside your GitHub pull request.

1	+	values = (
2	+	'yes',
3	+	'correct'

Missing commas in tuples results in implicit string concatenation. Probably not what you intended to do.

Suggested changes

-	'correct'
+	'correct',
	Expand 2 lines ...

	4	+	'agreed',
	5	+	)

Cleaner and safer code

Code Review Doctor reviews your GitHub pull requests to prevent adding vulnerabilities, performance, or maintainability issues to your code. The code review runs in the cloud so there is nothing to install locally. Cutting edge technology achieves near-zero noise for developers so you can follow best practices without changing how you work.

Works with tools you use

- Detect typos
- Avoid mistakes
- Prevent bugs
- Improve tests
- Read more
- Avoid bugs
- Best practice
- Improve security
- ORM performance
- Read more
- Review PRs
- One click fix
- Clear advice
- Delta aware
- Read more
- Review PRs
- Fix suggested
- Clear advice
- Delta aware
- Read more

Read about how it works.

It caught malformed test assertions that were incorrect. The automated codebase audit tool is so handy!

Jared Lockhart, Senior Software Engineer at Mozilla

Extremely positive. Suggested useful changes, giving our senior developers time back.

Jon Atkinson, Technical Director at Giant

Uncovered some bugs in the tests - thanks a lot!

Patrick von Platen, Research Engineer at Hugging Face

A quick demo

This is a sample of the kind of advice you can expect from Code Review Doctor during code review.

code-review-doctorbotwould suggest these changes right inside your pull requests on GitHub.

config/settings.py

21	21	def process_environment_variable(value):
22	22	if value > 10:
23	23	raise ValidationError('{value} is not allowed')

An f-string will not work if the f prefix is missing.

Suggested changes

-	raise ValidationError('{value} is not allowed')
+	raise ValidationError(f'{value} is not allowed')

41	41	⠀
42	42	# https://docs.djangoproject.com/en/dev/ref/settings/#append-slash
43	43	APPEND_SLASH = True

Stating defaults add complexity when reading the code but does not change Django's behaviour.

Suggested changes

-	APPEND_SLASH = True

44	44		⠀
45	45		# Application definition

    'whitenoise.middleware.WhiteNoiseMiddleware',

The order of middleware affections the outcome. Some middleware are dependant on the functionality of other middleware. For example a middleware that requires usage of request.session should come after the SessionMiddleware.

Your website is vulnerable to a number of common hacker attacks because MIDDLEWARE setting is missing django.middleware.security.SecurityMiddleware.

Suggested changes

+	'django.middleware.security.SecurityMiddleware',
	Expand 12 lines ...

    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.cache.UpdateCacheMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

ROOT_URLCONF = 'config.urls'

TEMPLATES = [

DIRS must be absolute paths. Relative paths will not work.

    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': ['core/templates'],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.template.context_processors.i18n',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
                'staticpages.context_processors.get_pages',
                'staticpages.context_processors.get_categories',
                'staticpages.context_processors.get_urls',
            ],
        },
    },
]

# Static files served with Whitenoise and AWS Cloudfront

119	119	SECURE_CONTENT_TYPE_NOSNIFF = True
120	120
121	121	USE_I18N = True

Stating defaults add complexity when reading the code but does not change Django's behaviour.

Suggested changes

-	USE_I18N = True

122	122		⠀
123	123		DIRECTORY_CONSTANTS_URL_GREAT_DOMESTIC = env.str(

SSO_PROFILE_URL = 'https://profile.com'

Your website is vulnerable because the CSRF_COOKIE_SECURE setting is not set - so hackers have an easier time stealing your CSRF cookies on HTTP connections, allowing them to circumvent your CSRF protection.

Your website is vulnerable because the SESSION_COOKIE_SECURE setting is not set - so hackers have an easier time stealing your users' session cookies on HTTP connections.

Suggested changes

-+
-+
-+
+SESSION_COOKIE_SECURE = os.getenv('SESSION_COOKIE_SECURE_ENABLED') != 'False'
-+
-+
+CSRF_COOKIE_SECURE = os.getenv('CSRF_COOKIE_SECURE_ENABLED') != 'False'

apps/aarvark/tests.py

25	25	allocate(membership)
26	26	assert membership.condition is not None
27	27	assert membership.observation_set.all().count() is 0

Primitive data types such as strings and integers should be compared using == and != rather than is and is not.

Suggested changes

-	assert membership.observation_set.all().count() is 0
+	assert membership.observation_set.all().count() == 0
	Expand 2 lines ...

28	28		⠀
29	29		membership.add_observations()

        assert membership.observation_set.all().count() is 1

Primitive data types such as strings and integers should be compared using == and != rather than is and is not.

Suggested changes

-	assert membership.observation_set.all().count() is 1
+	assert membership.observation_set.all().count() == 1
	Expand 2 lines ...

31	31		⠀
32	32		def test_title_content(self):

34	34	expected_object_name = f'{topic.title}'
35	35	count = Topic.objects.count()
36	36	self.assertEquals(expected_object_name, 'Test topic')

Don't use deprecated TestCase assertions like self.assertEquals. Use their modern counterpart instead like self.assertEqual.

Suggested changes

-	self.assertEquals(expected_object_name, 'Test topic')
+	self.assertEqual(expected_object_name, 'Test topic')

        self.assertEquals(count,1)

Don't use deprecated TestCase assertions like self.assertEquals. Use their modern counterpart instead like self.assertEqual.

Suggested changes

-	self.assertEquals(count,1)
+	self.assertEqual(count,1)
	Expand 2 lines ...

38	38		⠀
39	39		def test_one_object(self):

        self.assertTrue(len(DataCollector().profiles), 1)

assertTrue is not for comparing arguments. Consider assertEqual or others instead.

Suggested changes

-	self.assertTrue(len(DataCollector().profiles), 1)
+	self.assertEqual(len(DataCollector().profiles), 1)
	Expand 2 lines ...

41	41		⠀
42	42		def test_pivot_dtaccessor(self):

67	67	}
68	68	)
69	69	df["dt1"] = df["dt1"].apply(lambda d: pd.Timestamp(d))

Avoid unnecessarily wrapping a function in a lambda

Suggested changes

-	df["dt1"] = df["dt1"].apply(lambda d: pd.Timestamp(d))
+	df["dt1"] = df["dt1"].apply(pd.Timestamp)

        df["dt2"] = df["dt2"].apply(lambda d: pd.Timestamp(d))

Avoid unnecessarily wrapping a function in a lambda

Suggested changes

-	df["dt2"] = df["dt2"].apply(lambda d: pd.Timestamp(d))
+	df["dt2"] = df["dt2"].apply(pd.Timestamp)
	Expand 2 lines ...

71	71		⠀
72	72		result = pivot_table(

83	83	)
84	84	self.assertTrue(serializer.data[0]["is_control"])
85	85	self.assertFalse(any([b["is_control"] for b in serializer.data[1:]]))

all and any can take a generator, so constructing a list first may be unnecessary.

Suggested changes

-	self.assertFalse(any([b["is_control"] for b in serializer.data[1:]]))
+	self.assertFalse(any(b["is_control"] for b in serializer.data[1:]))

86	86		self.assertEqual(sorted_treatment_ids, [b["id"] for b in serializer.data[1:]])
87	87		⠀

95	95	).render(self.mocked_context)
96	96
97	97	self.assertTrue('{"name": "Tom Waits"}' in out)

assertIn and assertNotIn provide more helpful failure messages than assertTrue or assertFalse.

Suggested changes

-	self.assertTrue('{"name": "Tom Waits"}' in out)
+	self.assertIn('{"name": "Tom Waits"}', out)
	Expand 2 lines ...

98	98		⠀
99	99		def test_can_be_hashed_including_non_ascii(self):

106	106	user=self.user)
107	107	hash = a.generate_hash()
108	108	self.assertTrue(hash is not None)

assertIsNone and assertIsNotNone provide more helpful failure messages than assertTrue or assertFalse.

Suggested changes

-	self.assertTrue(hash is not None)
+	self.assertIsNotNone(hash)
	Expand 2 lines ...

109	109	⠀
110	110	def test_is_valid_call_prerequisite_validators(self, mock_prerequisite_validator):
111	111	prerequisite_string = "LOSIS1452 OU LPORT5896"

        program_tree = ProgramTreeFactory(),

A comma after a value turns it into tuple. Probably not what you intended to do.

Suggested changes

-	program_tree = ProgramTreeFactory(),
+	program_tree = ProgramTreeFactory()

⠀

    def test_is_valid_call_prerequisite_validators(self, mock_prerequisite_validator):

Duplicate names for tests results in some tests being skipped.

Suggested changes

-	def test_is_valid_call_prerequisite_validators(self, mock_prerequisite_validator):
+	def test_is_valid_call_prerequisite_validators_two(self, mock_prerequisite_validator):
	Expand 2 lines ...

apps/aarvark/mixins.py

3	3	# ProbablyMeantTuple
4	4
5	5	class LoginRequiredMixin(object):

A class is inherited from object by default, so explicitly inheriting from object is redundant. Removing it keeps the code simpler.

Suggested changes

-	class LoginRequiredMixin(object):
+	class LoginRequiredMixin:
	Expand 2 lines ...

6	6	⠀
7	7	redirect_field_name = 'next'
8	8	login_url = None

    fields = ('title')

A tuple with only one element must end with a comma. Python won't know it's a tuple without the comma. Probably not what you intended to do.

Suggested changes

-	fields = ('title')
+	fields = ('title',)
	Expand 2 lines ...

10	10		⠀
11	11		def format_product_option(self, product_string):

13	13	try:
14	14	list(map(int, products))
15	15	except ValueError or SyntaxError:

Catch multiple exception types using a tuple, not with or operator.

Suggested changes

-	except ValueError or SyntaxError:
+	except (ValueError, SyntaxError):
	Expand 2 lines ...

apps/aarvark/cache.py

8	8	⠀
9	9
10	10	@dataclass

Use frozen=True to make the dataclasses immutable and hashable.

Suggested changes

-	@dataclass
+	@dataclass(frozen=True)
	Expand 2 lines ...

11	11		class Cache:
12	12		name: str

14	14	⠀
15	15
16	16	class ArgInfo(NamedTuple):

Consider using the dataclass here instead for simplicity, performance gains, and consistency

Suggested changes

--
+class ArgInfo(NamedTuple):
-+
+@dataclass(frozen=True)
-+
+class ArgInfo:
     args: List[Value]

17	17		args: List[Value]
18	18		arg_names: List[Optional[str]]

23	23	global _caches
24	24	caches = _caches
25	25	if caches == None:

Singleton data types such as booleans should be compared using is and is not rather than == and !=.

Suggested changes

-	if caches == None:
+	if caches is None:
	Expand 2 lines ...

        pidCache = dict((u.pid, u) for u in _databaseQuery().all())

Using list and dict comprehension is simpler and computationally quicker than calling list() and dict().

Suggested changes

-	pidCache = dict((u.pid, u) for u in _databaseQuery().all())
+	pidCache = {u.pid: u for u in _databaseQuery().all()}
	Expand 2 lines ...

        usernameCache = dict((u.username, u) for u in pidCache.values())

Using list and dict comprehension is simpler and computationally quicker than calling list() and dict().

Suggested changes

-	usernameCache = dict((u.username, u) for u in pidCache.values())
+	usernameCache = {u.username: u for u in pidCache.values()}
	Expand 2 lines ...

        idCache = dict((u.id, u) for u in pidCache.values())

Using list and dict comprehension is simpler and computationally quicker than calling list() and dict().

Suggested changes

-	idCache = dict((u.id, u) for u in pidCache.values())
+	idCache = {u.id: u for u in pidCache.values()}
	Expand 2 lines ...

29	29		caches = (pidCache, usernameCache, idCache)
30	30		_caches = caches

36	36	def open(self, parent=None, detail_item=None, external=False):
37	37	url = 'https://books.google.com/books'
38	38	if True or external or self.config.get('open_external', False):

This code will always evaluate the same value regardless of the inputs.

46	46	def __str__(self):
47	47	course = self.course
48	48	return "{} - {} - {}".format(self._type, course.id, course.subject.name)

f-string is easier to read, write, and less computationally expensive than legacy string formatting.

        with open('rack_elevation.css', 'w') as css_file:

Read and write calls will fail at runtime if open is called with incorrect modes.

Not specifying encoding when reading a file can cause UnicodeDecodeError because Python assumes the file is encoded with the OS's default text encoding, but that's often an invalid assumption.

Suggested changes

-	with open('rack_elevation.css', 'w') as css_file:
+	with open('rack_elevation.css') as css_file:

apps/aarvark/ref_proxy.py

12	12	_invoke_func = _local_invoke
13	13	# Bypass ScriptModules when checking for async function attribute.
14	14	bypass_type = issubclass(rref_type, torch.jit.ScriptModule) or issubclass(

issubclass can take multiple types, so there is no need to call them multiple times for each type.

Suggested changes

--
+    bypass_type = issubclass(rref_type, torch.jit.ScriptModule) or issubclass(
-+
+    bypass_type = issubclass(rref_type, (torch.jit.ScriptModule, torch._C.ScriptModule))
--
+        rref_type, torch._C.ScriptModule
--
+    )
     import ipdb; ipdb.set_trace()

15	15		rref_type, torch._C.ScriptModule
16	16		)

    import ipdb; ipdb.set_trace()

Don't commit breakpoints. This will pause execution in prod and break the app.

Suggested changes

-	import ipdb; ipdb.set_trace()

18	18		⠀
19	19		⠀

23	23	conf = {}
24	24	with open(self._conf_path, "r", encoding='utf-8') as conf_src:
25	25	conf = json.loads(conf_src.read())

json.load(f) simplifies reading JSON from disk.

Suggested changes

-	conf = json.loads(conf_src.read())
+	conf = json.load(conf_src)

26	26		return conf
27	27		⠀

        f.write(json.dumps(result))

json.dump(f) simplifies writing JSON to disk.

Suggested changes

-	f.write(json.dumps(result))
+	json.dump(result, f)

apps/aarvark/api.py

16	16	user = PlatformUser.objects.get(pk=1)
17	17
18	18	if user.group.id != get_request_user_organisation_id(request):

When working with foreign keys, accessing the related field will result in a database read. That can be eliminated by using *_id, which is the foreign key value that Django has already cached on the object to make this scenario more efficient.

Suggested changes

-	if user.group.id != get_request_user_organisation_id(request):
+	if user.group_id != get_request_user_organisation_id(request):
	Expand 3 lines ...

19	19	raise PermissionDenied()
20	20
21	21	return Response()

25	25	⠀
26	26	def post(self, request, pk):
27	27	if PlatformUser.objects.filter(group=1).count() > 0:

Comparing queryset.count() is less efficient than checking queryset.exists(), so use querySet.count() if you only want the count, and use queryset.exists() if you only want to find out if at least one result exists.

Suggested changes

-	if PlatformUser.objects.filter(group=1).count() > 0:
+	if PlatformUser.objects.filter(group=1).exists():
	Expand 2 lines ...

28	28	raise PermissionDenied()
29	29	return Response()
30	30	⠀

34	34	⠀
35	35	def post(self, request, pk):
36	36	if PlatformUser.objects.filter(group=1):

This can load every row in the database table into memory because the queryset is evaluated. Checking if a queryset is truthy/falsey is much less efficient than checking queryset.exists().

Suggested changes

-	if PlatformUser.objects.filter(group=1):
+	if PlatformUser.objects.filter(group=1).exists():
	Expand 2 lines ...

37	37	raise PermissionDenied()
38	38	return Response()
39	39	⠀

        user = PlatformUser.objects.all().order_by('?')[0]

Using order_by('?') can be very inefficient if you have lots of rows in the table. Moving the randomness to the application layer will probably give significant a performance improvement.

Suggested changes

-	user = PlatformUser.objects.all().order_by('?')[0]
+	user = PlatformUser.objects.all()[random.randint(0, PlatformUser.objects.count() - 1)]

Use our GitHub pull request bot to prevent issue like these getting into your codebase.

Setup GitHub integration

63	63	'django.contrib.sessions.middleware.SessionMiddleware',
64	64	'django.middleware.cache.UpdateCacheMiddleware',
65	65	'django.middleware.common.CommonMiddleware',
66	66	'django.middleware.cache.FetchFromCacheMiddleware',
67	67	'django.middleware.csrf.CsrfViewMiddleware',
68	68	'django.contrib.auth.middleware.AuthenticationMiddleware',
69	69	'django.contrib.messages.middleware.MessageMiddleware',
70	70	'django.middleware.clickjacking.XFrameOptionsMiddleware',
71	71	]
72	72
73	73	ROOT_URLCONF = 'config.urls'
74	74
75	75	TEMPLATES = [

76	76	{
77	77	'BACKEND': 'django.template.backends.django.DjangoTemplates',
78	78	'DIRS': ['core/templates'],
79	79	'APP_DIRS': True,
80	80	'OPTIONS': {
81	81	'context_processors': [
82	82	'django.template.context_processors.debug',
83	83	'django.template.context_processors.request',
84	84	'django.template.context_processors.i18n',
85	85	'django.contrib.auth.context_processors.auth',
86	86	'django.contrib.messages.context_processors.messages',
87	87	'staticpages.context_processors.get_pages',
88	88	'staticpages.context_processors.get_categories',
89	89	'staticpages.context_processors.get_urls',
90	90	],
91	91	},
92	92	},
93	93	]
94	94
95	95	# Static files served with Whitenoise and AWS Cloudfront