Pro tip: if you need to shove a lot of data through a Django view, DO NOT attempt to build one big string by repeated concatenation; use a generator.
Here's something that seems sensible on the surface:
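body = ""  # start empty so the first concatenation has something to add to
for region in regions:
    # named 'zipcode' rather than 'zip' to avoid shadowing the builtin
    for zipcode in region.zipcode.iterator():
        body = body + "\t".join(
            [region.region, zipcode.zipcode]
        ) + "\n"
resp = HttpResponse(body, mimetype='application/ms-excel')
resp['Content-Disposition'] = 'attachment; filename=%s.xls' % (unicode("Regions"),)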
Except when the set of regions and zipcodes gets large! Let's consider the following replacement that uses a generator:
def dump_it():
    for region in regions:
        for zipcode in region.zipcode.iterator():
            yield "\t".join(
                [region.region, zipcode.zipcode]
            ) + "\n"
resp = HttpResponse(dump_it(), mimetype='application/ms-excel')
resp['Content-Disposition'] = 'attachment; filename=%s.xls' % (unicode("Regions"),)
There is a slight performance difference: the generator version takes a few seconds (2 seconds for almost 70K resulting rows on my dog of a laptop). However, I attempted to benchmark the string-building version on the same dataset, and it took almost 13 MINUTES (775 seconds). So, slight, meaning within 3 orders of magnitude (roughly 390 times slower).
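If you want to poke at the pattern without a Django project handy, here is a minimal sketch in plain Python 3. Everything in it is made up for illustration: REGIONS is a hypothetical stand-in for the queryset, and iter_rows and build_body are names I picked, not anything from Django. The likely culprit in the slow version is that each `body + ...` can end up copying everything built so far, so the sketch also shows `"".join`, which assembles the pieces in a single pass if you really do need one string.

```python
# Hypothetical stand-in data: (region name, list of zip codes) pairs.
# In the real view these would come from the database.
REGIONS = [
    ("North", ["10001", "10002"]),
    ("South", ["30301"]),
]


def iter_rows(regions):
    """Yield one tab-separated line per (region, zipcode) pair."""
    for region, zipcodes in regions:
        for zipcode in zipcodes:
            yield "\t".join([region, zipcode]) + "\n"


def build_body(regions):
    """Build the whole body at once, joining the pieces in one pass."""
    return "".join(iter_rows(regions))


# The generator hands out rows lazily; list() is only here to inspect them.
lines = list(iter_rows(REGIONS))
body = build_body(REGIONS)
```

One caveat if you are on a newer Django (1.5 or later): a plain HttpResponse consumes the iterator up front, and StreamingHttpResponse is the class intended for sending a generator's output to the client piece by piece.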