{"id":126,"date":"2020-09-02T21:46:16","date_gmt":"2020-09-02T18:16:16","guid":{"rendered":"http:\/\/mahshad.pro\/?p=126"},"modified":"2021-02-11T23:07:15","modified_gmt":"2021-02-11T19:37:15","slug":"how-to-make-animated-choropleth-maps-with-discrete-colors-using-python-and-plotly","status":"publish","type":"post","link":"https:\/\/mahshad.pro\/?p=126","title":{"rendered":"How to make animated choropleth maps with discrete colors using Python and Plotly"},"content":{"rendered":"\n<p>This tutorial is also available on <a href=\"https:\/\/medium.com\/@mahshadn\/animated-choropleth-map-with-discrete-colors-using-python-and-plotly-styling-5e208e5b6bf8\">Medium.com<\/a>.<\/p>\n\n\n\n<p>Ever since I was a little boy, I have been obsessed with maps! I remember buying paper maps of different continents and pinning them to the wall of my room. Back then, maps were static shapes that conveyed a limited amount of information. Fast forward to today, and we barely go a day without benefiting from the advancements of digital maps. 
<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"633\" src=\"http:\/\/mahshad.pro\/wp-content\/uploads\/british-library-qYMlpeQypGU-unsplash-1024x633.jpg\" alt=\"\" class=\"wp-image-128\" srcset=\"https:\/\/mahshad.pro\/wp-content\/uploads\/british-library-qYMlpeQypGU-unsplash-1024x633.jpg 1024w, https:\/\/mahshad.pro\/wp-content\/uploads\/british-library-qYMlpeQypGU-unsplash-300x185.jpg 300w, https:\/\/mahshad.pro\/wp-content\/uploads\/british-library-qYMlpeQypGU-unsplash-768x475.jpg 768w, https:\/\/mahshad.pro\/wp-content\/uploads\/british-library-qYMlpeQypGU-unsplash-1870x1156.jpg 1870w, https:\/\/mahshad.pro\/wp-content\/uploads\/british-library-qYMlpeQypGU-unsplash-400x247.jpg 400w, https:\/\/mahshad.pro\/wp-content\/uploads\/british-library-qYMlpeQypGU-unsplash-800x495.jpg 800w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption> Image taken from page 1130 of \u2018History of England and the British Empire\u2019. Photo by <a rel=\"noreferrer noopener\" href=\"https:\/\/unsplash.com\/@britishlibrary?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\">British Library<\/a> on <a rel=\"noreferrer noopener\" href=\"https:\/\/unsplash.com\/?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText\" target=\"_blank\">Unsplash<\/a>.<\/figcaption><\/figure>\n\n\n\n<p><a rel=\"noreferrer noopener\" href=\"https:\/\/plotly.com\/python\/\" target=\"_blank\">Plotly<\/a> is a data visualization library that provides a wide variety of basic charts, statistical charts, scientific charts, financial charts, maps, 3D charts, animated graphs and more for different types of visualization applications. 
<a rel=\"noreferrer noopener\" href=\"https:\/\/plotly.com\/python\/plotly-express\/\" target=\"_blank\">Plotly Express<\/a> is an easy-to-use, high-level interface to Plotly that works with a variety of data and makes it easy to create professional-looking graphs. We can use either Mapbox or Plotly\u2019s built-in maps for visualization. We will use Plotly\u2019s maps in this tutorial.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><strong>Animated maps are useful when we want to show how a situation or value changes across a variety of geographic regions over the course of&nbsp;time.<\/strong><\/p><\/blockquote>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"3ba5\">Dataset<\/h4>\n\n\n\n<p>Since COVID-19 has been a hot topic during the past few months and most readers are familiar with how quickly the virus can spread, we use the COVID-19 data set from the&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/health-infobase.canada.ca\/src\/data\/covidLive\/covid19.csv\" target=\"_blank\">Government of Canada\u2019s website<\/a>&nbsp;for this tutorial. The ultimate objective is to create an animated choropleth map of Canadian provinces that shows the spread of COVID-19 week by week.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Geojson data<\/h4>\n\n\n\n<p>For choropleth maps we need a&nbsp;.geojson file which, in simple terms, describes the boundaries of a region as sets of coordinates. Other information related to the boundary, such as its name and description, is usually included in the&nbsp;.geojson file as well. For the Canadian provinces, you can either download the boundary file from <a rel=\"noreferrer noopener\" href=\"https:\/\/www12.statcan.gc.ca\/census-recensement\/2011\/geo\/bound-limit\/bound-limit-2011-eng.cfm\" target=\"_blank\">Statistics Canada<\/a> and convert it to a geojson file, or use one of the readily available geojson files. 
For this project I am using the geojson file taken from <a rel=\"noreferrer noopener\" href=\"https:\/\/thomson.carto.com\/tables\/canada_provinces\/public\/map\" target=\"_blank\">Carto<\/a>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Data cleaning<\/strong><\/h4>\n\n\n\n<p>We need the date, the number of cases, and the province id and name. Since we would like a weekly time frame, we create a <code>timeframe<\/code> column that holds the month name and the week number within that month, as shown below. We can make such adjustments either in Excel or directly in Python. The cleaned dataset file is available on my <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/Mahshadn\/canadian_map\" target=\"_blank\">GitHub<\/a>.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"704\" height=\"274\" src=\"http:\/\/mahshad.pro\/wp-content\/uploads\/firstdata.png\" alt=\"\" class=\"wp-image-129\" srcset=\"https:\/\/mahshad.pro\/wp-content\/uploads\/firstdata.png 704w, https:\/\/mahshad.pro\/wp-content\/uploads\/firstdata-300x117.png 300w, https:\/\/mahshad.pro\/wp-content\/uploads\/firstdata-400x156.png 400w\" sizes=\"auto, (max-width: 704px) 100vw, 704px\" \/><figcaption> Dataset after a few adjustments to the date format and the weekly time frame <\/figcaption><\/figure>\n\n\n\n<p>As you can see in the dataset, we have the total number of cases for each day and each province. Using \u2018cases\u2019 as the color value would give us a continuous color legend. 
Since we would like discrete colors instead, we need to add a new column called \u2018category\u2019 and assign each row a category based on its number of cases, as below:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n\ndf = pd.read_csv(\"ca_pr_day_n.csv\")\ndf&#91;'category'] = ''\n\n# Categorize the number of cases and assign a category to each row\n# (lower bounds are inclusive so boundary values such as 1001 are not left uncategorized)\ndef set_cat(row):\n    if row&#91;'cases'] == 0:\n        return '0'\n    if row&#91;'cases'] &gt; 0 and row&#91;'cases'] &lt; 1001:\n        return '1 - 1,000'\n    if row&#91;'cases'] &gt;= 1001 and row&#91;'cases'] &lt; 5001:\n        return '1,001 - 5,000'\n    if row&#91;'cases'] &gt;= 5001 and row&#91;'cases'] &lt; 10001:\n        return '5,001 - 10,000'\n    if row&#91;'cases'] &gt;= 10001 and row&#91;'cases'] &lt; 30001:\n        return '10,001 - 30,000'\n    if row&#91;'cases'] &gt;= 30001 and row&#91;'cases'] &lt; 50001:\n        return '30,001 - 50,000'\n    if row&#91;'cases'] &gt;= 50001:\n        return '50,001 and higher'\n\ndf = df.assign(category=df.apply(set_cat, axis=1))<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\">Map creation<\/h4>\n\n\n\n<p>With the data cleaned and categorized in the data frame, we can now create the choropleth map:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\nimport plotly.express as px\nimport json\n\ndf = pd.read_csv(\"ca_pr_day_n.csv\")\ndf&#91;'category'] = ''\n\n# Categorize the number of cases and assign a category to each row\n# (lower bounds are inclusive so boundary values such as 1001 are not left uncategorized)\ndef set_cat(row):\n    if row&#91;'cases'] == 0:\n        return '0'\n    if row&#91;'cases'] &gt; 0 and row&#91;'cases'] &lt; 1001:\n        return '1 - 1,000'\n    if row&#91;'cases'] &gt;= 1001 and row&#91;'cases'] &lt; 5001:\n        return '1,001 - 5,000'\n    if row&#91;'cases'] &gt;= 5001 and row&#91;'cases'] &lt; 10001:\n        return '5,001 - 10,000'\n    if row&#91;'cases'] &gt;= 10001 and row&#91;'cases'] &lt; 30001:\n        return '10,001 - 30,000'\n    if 
row&#91;'cases'] &gt;= 30001 and row&#91;'cases'] &lt; 50001:\n        return '30,001 - 50,000'\n    if row&#91;'cases'] &gt;= 50001:\n        return '50,001 and higher'\n\ndf = df.assign(category=df.apply(set_cat, axis=1))\n\n# Load the geojson data into mp\nwith open(\"canada_provinces.geojson\", \"r\") as geo:\n    mp = json.load(geo)\n\n# Create choropleth map\nfig = px.choropleth(df,\n                    locations=\"cartodb_id\",\n                    geojson=mp,\n                    featureidkey=\"properties.cartodb_id\",\n                    color=\"category\",\n                    color_discrete_map={\n                        '0': '#fffcfc',\n                        '1 - 1,000' : '#ffdbdb',\n                        '1,001 - 5,000' : '#ffbaba',\n                        '5,001 - 10,000' : '#ff9e9e',\n                        '10,001 - 30,000' : '#ff7373',\n                        '30,001 - 50,000' : '#ff4d4d',\n                        '50,001 and higher' : '#ff0d0d'},\n                    category_orders={\n                      'category' : &#91;\n                          '0',\n                          '1 - 1,000',\n                          '1,001 - 5,000',\n                          '5,001 - 10,000',\n                          '10,001 - 30,000',\n                          '30,001 - 50,000',\n                          '50,001 and higher'\n                      ]\n                    },\n                    animation_frame=\"timeframe\",\n                    scope='north america',\n                    title='&lt;b&gt;COVID-19 cases in Canadian provinces&lt;\/b&gt;',\n                    labels={'cases' : 'Number of Cases',\n                            'category' : 'Category'},\n                    hover_name='province',\n                    hover_data={\n                        'cases' : True,\n                        'cartodb_id' : False\n                    },\n                    # height=900,\n                    locationmode='geojson-id',\n                
    )\n\n# Adjust map layout styling\nfig.update_layout(\n    showlegend=True,\n    legend_title_text='&lt;b&gt;Total Number of Cases&lt;\/b&gt;',\n    font={\"size\": 16, \"color\": \"#808080\", \"family\" : \"calibri\"},\n    margin={\"r\":0,\"t\":40,\"l\":0,\"b\":0},\n    legend=dict(orientation='v'),\n    geo=dict(bgcolor='rgba(0,0,0,0)', lakecolor='#e0fffe')\n)\n\n# Adjust map geo options\nfig.update_geos(showcountries=False, showcoastlines=False,\n                showland=False, fitbounds=\"locations\",\n                subunitcolor='white')\nfig.show()\n<\/code><\/pre>\n\n\n\n<p>A few points are worth mentioning here:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>The more accurate you want the color-filled layer of your map to be, the more detailed the boundary polygons in your geojson file need to be, which results in a larger geojson file.<\/li><li>When using geojson files, you need to set <code>featureidkey<\/code> to the property in the geojson file that holds the id of each province (here <code>properties.cartodb_id<\/code>), so that it can be matched against the <code>locations<\/code> column.<\/li><li>We set <code>category_orders<\/code> to have the legend sorted in our desired way. Otherwise, the legend follows the order in which the categories are stored in the data frame.<\/li><li><code>animation_frame<\/code> is the argument that animates the map based on the frames it creates from the associated column. Without setting <code>animation_frame<\/code> we get a static map.<\/li><li>A larger number of frames can make the figure heavier to render.<\/li><li>We can use HTML tags to adjust styling. 
For example: newline <code>&lt;br&gt;<\/code>, bold <code>&lt;b&gt;&lt;\/b&gt;<\/code>, italics <code>&lt;i&gt;&lt;\/i&gt;<\/code>, hyperlinks <code>&lt;a href='...'&gt;&lt;\/a&gt;<\/code>, as well as the tags <code>&lt;em&gt;<\/code>, <code>&lt;sup&gt;<\/code>, <code>&lt;sub&gt;<\/code> and <code>&lt;span&gt;<\/code>.<\/li><li>If you are running the 32-bit version of Python and receive a \u201c<strong>MemoryError<\/strong>\u201d, you probably need to uninstall the 32-bit version and install 64-bit Python instead.<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Troubleshooting<\/h4>\n\n\n\n<p>Running the above code, we encounter unusual behavior in the map, as shown below:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"904\" height=\"536\" src=\"http:\/\/mahshad.pro\/wp-content\/uploads\/choropleth-map-incorrect.gif\" alt=\"\" class=\"wp-image-132\"\/><figcaption> Map is not functioning as expected <\/figcaption><\/figure>\n\n\n\n<p>The issue comes from the discrete color setting. Because each color is mapped to one category, every frame of the animation has to include all of the possible categories. We can add the missing categories to every frame with the following code:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>catg = df&#91;'category'].unique()\ndts = df&#91;'timeframe'].unique()\n\n# DataFrame.append was removed in pandas 2.0, so build the filler rows once and concatenate them\nfiller = pd.DataFrame(&#91;\n    {'timeframe' : tf, 'cases' : 'N', 'cartodb_id' : '0', 'category' : i}\n    for tf in dts for i in catg\n])\ndf = pd.concat(&#91;df, filler], ignore_index=True)<\/code><\/pre>\n\n\n\n<p>The code above adds every distinct category to each time frame, so we can be sure that all of the time frames contain all of the categories.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Final functional code<\/h4>\n\n\n\n<p>The complete working code is as follows:<\/p>\n\n\n\n<pre 
class=\"wp-block-code\"><code>import pandas as pd\nimport plotly.express as px\nimport json\n\ndf = pd.read_csv(\"ca_pr_day_n.csv\")\ndf&#91;'category'] = ''\n\n# Categorize the number of cases and assign a category to each row\n# (lower bounds are inclusive so boundary values such as 1001 are not left uncategorized)\ndef set_cat(row):\n    if row&#91;'cases'] == 0:\n        return '0'\n    if row&#91;'cases'] &gt; 0 and row&#91;'cases'] &lt; 1001:\n        return '1 - 1,000'\n    if row&#91;'cases'] &gt;= 1001 and row&#91;'cases'] &lt; 5001:\n        return '1,001 - 5,000'\n    if row&#91;'cases'] &gt;= 5001 and row&#91;'cases'] &lt; 10001:\n        return '5,001 - 10,000'\n    if row&#91;'cases'] &gt;= 10001 and row&#91;'cases'] &lt; 30001:\n        return '10,001 - 30,000'\n    if row&#91;'cases'] &gt;= 30001 and row&#91;'cases'] &lt; 50001:\n        return '30,001 - 50,000'\n    if row&#91;'cases'] &gt;= 50001:\n        return '50,001 and higher'\n\ndf = df.assign(category=df.apply(set_cat, axis=1))\n\n# Add all available categories to each time frame\n# (DataFrame.append was removed in pandas 2.0, so build the filler rows once and concatenate them)\ncatg = df&#91;'category'].unique()\ndts = df&#91;'timeframe'].unique()\n\nfiller = pd.DataFrame(&#91;\n    {'timeframe' : tf, 'cases' : 'N', 'cartodb_id' : '0', 'category' : i}\n    for tf in dts for i in catg\n])\ndf = pd.concat(&#91;df, filler], ignore_index=True)\n\n# Load the geojson data into mp\nwith open(\"canada_provinces.geojson\", \"r\") as geo:\n    mp = json.load(geo)\n\n# Create choropleth map\nfig = px.choropleth(df,\n                    locations=\"cartodb_id\",\n                    geojson=mp,\n                    featureidkey=\"properties.cartodb_id\",\n                    color=\"category\",\n                    color_discrete_map={\n                        '0': '#fffcfc',\n                        '1 - 1,000' : '#ffdbdb',\n                        '1,001 - 5,000' : '#ffbaba',\n                        '5,001 - 10,000' : '#ff9e9e',\n                        '10,001 - 30,000' : '#ff7373',\n                        '30,001 - 50,000' : '#ff4d4d',\n                        
'50,001 and higher' : '#ff0d0d'},\n                    category_orders={\n                      'category' : &#91;\n                          '0',\n                          '1 - 1,000',\n                          '1,001 - 5,000',\n                          '5,001 - 10,000',\n                          '10,001 - 30,000',\n                          '30,001 - 50,000',\n                          '50,001 and higher'\n                      ]\n                    },\n                    animation_frame=\"timeframe\",\n                    scope='north america',\n                    title='&lt;b&gt;COVID-19 cases in Canadian provinces&lt;\/b&gt;',\n                    labels={'cases' : 'Number of Cases',\n                            'category' : 'Category'},\n                    hover_name='province',\n                    hover_data={\n                        'cases' : True,\n                        'cartodb_id' : False\n                    },\n                    # height=900,\n                    locationmode='geojson-id',\n                    )\n\n# Adjust map layout styling\nfig.update_layout(\n    showlegend=True,\n    legend_title_text='&lt;b&gt;Total Number of Cases&lt;\/b&gt;',\n    font={\"size\": 16, \"color\": \"#808080\", \"family\" : \"calibri\"},\n    margin={\"r\":0,\"t\":40,\"l\":0,\"b\":0},\n    legend=dict(orientation='v'),\n    geo=dict(bgcolor='rgba(0,0,0,0)', lakecolor='#e0fffe')\n)\n\n# Adjust map geo options\nfig.update_geos(showcountries=False, showcoastlines=False,\n                showland=False, fitbounds=\"locations\",\n                subunitcolor='white')\nfig.show()\n<\/code><\/pre>\n\n\n\n<p>The outcome will look and work like this:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"566\" src=\"http:\/\/mahshad.pro\/wp-content\/uploads\/choropleth-map-animated-1024x566.gif\" alt=\"\" class=\"wp-image-133\" 
srcset=\"https:\/\/mahshad.pro\/wp-content\/uploads\/choropleth-map-animated-1024x566.gif 1024w, https:\/\/mahshad.pro\/wp-content\/uploads\/choropleth-map-animated-300x166.gif 300w, https:\/\/mahshad.pro\/wp-content\/uploads\/choropleth-map-animated-768x424.gif 768w, https:\/\/mahshad.pro\/wp-content\/uploads\/choropleth-map-animated-400x221.gif 400w, https:\/\/mahshad.pro\/wp-content\/uploads\/choropleth-map-animated-800x442.gif 800w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption> Final animated choropleth map with discrete colors <\/figcaption><\/figure>\n\n\n\n<p>You can check the outcome <a href=\"https:\/\/mahshadn.github.io\/canadian_map\/\" rel=\"noreferrer noopener\" target=\"_blank\">here<\/a>. All the files associated with this tutorial are available on my <a href=\"https:\/\/github.com\/Mahshadn\/canadian_map\" rel=\"noreferrer noopener\" target=\"_blank\">GitHub<\/a>.<\/p>\n\n\n\n<p>There are certainly other ways to make such maps, or to do the same with more optimized code. Please feel free to leave a comment and share your opinion or code.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Conclusion<\/h4>\n\n\n\n<p>Plotly Express, as a high-level interface to Plotly, makes it convenient to visualize data in Python. Plotly\u2019s <a href=\"https:\/\/plotly.com\/python\/graph-objects\/\" rel=\"noreferrer noopener\" target=\"_blank\">Graph Objects<\/a> offer a higher degree of freedom for creating more complex visuals or making further adjustments, but they require more lines of code and greater effort.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How to create an animated choropleth map with discrete colors using Python and Plotly. 
Visualizing COVID-19 data in Canadian provinces<\/p>\n","protected":false},"author":1,"featured_media":135,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[28,40],"tags":[39,34,31,32,37,30],"class_list":["post-126","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tutorials","category-works","tag-animated-choropleth-map","tag-choropleth-map","tag-data-analytics","tag-data-visualization","tag-plotly","tag-python","has-thumbnail"],"_links":{"self":[{"href":"https:\/\/mahshad.pro\/index.php?rest_route=\/wp\/v2\/posts\/126","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mahshad.pro\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mahshad.pro\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mahshad.pro\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mahshad.pro\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=126"}],"version-history":[{"count":15,"href":"https:\/\/mahshad.pro\/index.php?rest_route=\/wp\/v2\/posts\/126\/revisions"}],"predecessor-version":[{"id":395,"href":"https:\/\/mahshad.pro\/index.php?rest_route=\/wp\/v2\/posts\/126\/revisions\/395"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/mahshad.pro\/index.php?rest_route=\/wp\/v2\/media\/135"}],"wp:attachment":[{"href":"https:\/\/mahshad.pro\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=126"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mahshad.pro\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=126"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mahshad.pro\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=126"}],"curies":[{"name":"wp","href":"https:\/\/ap
i.w.org\/{rel}","templated":true}]}}