Generate documents with Jinja2 in Python

There are cases where you record data in a number of data sources and then need to generate documentation from that data. It is a task I get involved in from time to time. Rather than copying and pasting, we can use code to dynamically generate the documentation. In this post, I will show you some examples of how we can use Jinja2 to generate markdown in Python.

Examples

These examples will all generate markdown. This allows the generated files to be converted to pdf or any other format for that matter. Regardless of the target format, the purpose of these examples is to show you the power of Jinja, and how you can use the templates to generate any format your heart desires.

Just show the table

We want to show a simple markdown table. This template will parse the data, and show the fields in a table.

Jinja

|**Category**|**Finding id**|**Finding**|**Severity**|**Recommendation**|
|------------|--------------|-----------|------------|------------------|
{% for i in data -%}
|{{ i['category'] }}|{{ i['finding_id'] }}|{{ i['finding'] }}|{{ i['severity'] }}|{{ i['recommendation'] }}|
{% endfor %}

Result

Category	Finding id	Finding	Severity	Recommendation
Access Control	F001	Weak password policy	High	Enforce strong password policies with complexity requirements.
Network Security	F002	Open ports detected	Medium	Restrict unnecessary open ports and use a firewall.
Data Protection	F003	Sensitive data stored in plaintext	Critical	Implement encryption for sensitive data at rest and in transit.
Access Control	F004	Excessive user privileges	Medium	Review and reduce user access rights based on roles.
System Configuration	F005	Unpatched vulnerabilities	High	Apply security patches and updates regularly.
Application Security	F006	SQL Injection vulnerability	Critical	Use parameterized queries and input validation.
Network Security	F007	Weak encryption algorithms	High	Replace outdated encryption protocols with strong ones such as AES-256.
Endpoint Security	F008	Antivirus not installed	Medium	Deploy and maintain up-to-date antivirus software on endpoints.
Incident Response	F009	Lack of incident response plan	Low	Develop and test an incident response plan.
Access Control	F010	Inactive accounts not disabled	Medium	Implement a process to disable or delete inactive accounts.
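If you want to try the table template without setting up any files, a minimal sketch along these lines should work. It assumes `jinja2` is installed, uses two hand-written records in place of the CSV data, and keeps the template in a string; the names `TABLE_TEMPLATE` and `table_markdown` are only for illustration.

```python
import jinja2

# Two sample records shaped like the rows csv.DictReader would return.
data = [
    {"category": "Access Control", "finding_id": "F001",
     "finding": "Weak password policy", "severity": "High",
     "recommendation": "Enforce strong password policies with complexity requirements."},
    {"category": "Network Security", "finding_id": "F002",
     "finding": "Open ports detected", "severity": "Medium",
     "recommendation": "Restrict unnecessary open ports and use a firewall."},
]

# The same table template as above, held in a string (illustrative name).
TABLE_TEMPLATE = """\
|**Category**|**Finding id**|**Finding**|**Severity**|**Recommendation**|
|------------|--------------|-----------|------------|------------------|
{% for i in data -%}
|{{ i['category'] }}|{{ i['finding_id'] }}|{{ i['finding'] }}|{{ i['severity'] }}|{{ i['recommendation'] }}|
{% endfor %}"""

# The "-%}" on the for tag strips the newline after it, so each record
# becomes exactly one table row with no blank lines in between.
table_markdown = jinja2.Template(TABLE_TEMPLATE).render(data=data)
print(table_markdown)
```

The whitespace control matters here: a blank line between rows would end the markdown table early.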

Show me the counts

We want to see a list of all the categories, with a count against each of them. This can be useful to identify which categories may have the biggest impact, and hence need the most attention.

Jinja

{# Initialize category counts #}
{%- set category_counts = {} -%}
{%- for item in data -%}
    {%- set category = item['category'] -%}
    {%- if category_counts[category] is not defined -%}
        {%- set _ = category_counts.update({category: 1}) -%}
    {%- else -%}
        {%- set _ = category_counts.update({category: category_counts[category] + 1}) -%}
    {%- endif -%}
{%- endfor -%}

| **Category**       | **Count** |
|--------------------|----------|
{% for category, count in category_counts.items() -%}
| {{ category }}     | {{ count }} |
{% endfor %}

Result

Category	Count
Access Control	3
Network Security	2
Data Protection	1
System Configuration	1
Application Security	1
Endpoint Security	1
Incident Response	1
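As an alternative to building the dictionary by hand, Jinja2's built-in `groupby` filter can produce the same counts. This is only a sketch with three hand-written records, and the names `COUNTS_TEMPLATE` and `counts_markdown` are illustrative. One difference to be aware of: `groupby` sorts the groups by category name, so the row order will be alphabetical rather than the first-seen order shown above.

```python
import jinja2

# A trimmed-down sample; only the field we group on really matters here.
data = [
    {"category": "Access Control", "finding_id": "F001"},
    {"category": "Network Security", "finding_id": "F002"},
    {"category": "Access Control", "finding_id": "F004"},
]

# groupby yields (category, list_of_records) pairs, sorted by category.
COUNTS_TEMPLATE = """\
| **Category** | **Count** |
|--------------|-----------|
{% for category, findings in data | groupby('category') -%}
| {{ category }} | {{ findings | length }} |
{% endfor %}"""

counts_markdown = jinja2.Template(COUNTS_TEMPLATE).render(data=data)
print(counts_markdown)
```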

Detailed report

We now want to break down the original list of findings by category, and turn the severity into a badge icon that we pull from shields.io.

Jinja

{% set unique_categories = data | map(attribute='category') | map('default', '') | unique | list %}

{%- macro severityicon(text) -%}
    {{ '![icon](https://img.shields.io/badge/Critical-black)' if text == 'Critical' else 
       '![icon](https://img.shields.io/badge/High-red)' if text == 'High' else 
       '![icon](https://img.shields.io/badge/Medium-yellow)' if text == 'Medium' else 
       '![icon](https://img.shields.io/badge/Low-green)' if text == 'Low' else 
       '![icon](https://img.shields.io/badge/Unknown-blue)' }}
{%- endmacro -%}

|**Category**|**Finding**|**Severity**|**Recommendation**|
|--|--|--|--|
{% for c in unique_categories %}|**{{ c }}**||||
{% for x in data if x['category'] == c -%}
||`{{ x['finding_id'] }}` - {{ x['finding'] }}|{{ severityicon(x['severity']) }}|{{ x['recommendation'] }}|
{% endfor -%}
{% endfor %}

Result

Category	Finding	Recommendation
Access Control
	`F001` - Weak password policy	Enforce strong password policies with complexity requirements.
	`F004` - Excessive user privileges	Review and reduce user access rights based on roles.
	`F010` - Inactive accounts not disabled	Implement a process to disable or delete inactive accounts.
Network Security
	`F002` - Open ports detected	Restrict unnecessary open ports and use a firewall.
	`F007` - Weak encryption algorithms	Replace outdated encryption protocols with strong ones such as AES-256.
Data Protection
	`F003` - Sensitive data stored in plaintext	Implement encryption for sensitive data at rest and in transit.
System Configuration
	`F005` - Unpatched vulnerabilities	Apply security patches and updates regularly.
Application Security
	`F006` - SQL Injection vulnerability	Use parameterized queries and input validation.
Endpoint Security
	`F008` - Antivirus not installed	Deploy and maintain up-to-date antivirus software on endpoints.
Incident Response
	`F009` - Lack of incident response plan	Develop and test an incident response plan.
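The chained `if ... else` expression in the macro works, but it gets long as severities are added. One possible alternative is to move the mapping into Python and register it as a custom filter. The sketch below assumes the same badge colours as the macro; the filter name `severity_badge` and the `BADGE_COLOURS` dictionary are hypothetical names chosen for this example.

```python
import jinja2

# Severity to shields.io badge colour, mirroring the macro above.
BADGE_COLOURS = {"Critical": "black", "High": "red", "Medium": "yellow", "Low": "green"}

def severity_badge(severity):
    """Return a markdown image for a shields.io badge; unknown values fall back to blue."""
    if severity in BADGE_COLOURS:
        return f"![icon](https://img.shields.io/badge/{severity}-{BADGE_COLOURS[severity]})"
    return "![icon](https://img.shields.io/badge/Unknown-blue)"

env = jinja2.Environment()
# Register the function so templates can use it as a filter (hypothetical name).
env.filters["severity_badge"] = severity_badge

template = env.from_string("{{ severity | severity_badge }}")
high_badge = template.render(severity="High")
unknown_badge = template.render(severity="Informational")
print(high_badge)
print(unknown_badge)
```

Inside the report template, the severity cell would then read `{{ x['severity'] | severity_badge }}` instead of calling the macro.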

Just the criticals

In this example, we will filter only on the critical findings, and show them in a more readable format.

Jinja

{%- for x in data -%}
{% if x['severity'] == 'Critical' -%}
#### {{ x['finding_id'] }} - {{ x['finding'] }}

A {{ x['severity'] }} finding (`{{ x['finding_id'] }}`) was recorded.

**Finding**

> {{ x['finding'] }}

**Recommendation**

> {{ x['recommendation']}}
{% endif %}
{% endfor %}

Result

F003 - Sensitive data stored in plaintext

A Critical finding (F003) was recorded.

Finding

Sensitive data stored in plaintext

Recommendation

Implement encryption for sensitive data at rest and in transit.

F006 - SQL Injection vulnerability

A Critical finding (F006) was recorded.

Finding

SQL Injection vulnerability

Recommendation

Use parameterized queries and input validation.
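The same filtering can also be written with Jinja2's built-in `selectattr` filter, which removes the need for the nested `if`. This is a sketch under the same assumptions as before: three hand-written records, a shortened version of the layout above, and illustrative names (`CRITICAL_TEMPLATE`, `critical_markdown`).

```python
import jinja2

data = [
    {"finding_id": "F001", "finding": "Weak password policy", "severity": "High",
     "recommendation": "Enforce strong password policies with complexity requirements."},
    {"finding_id": "F003", "finding": "Sensitive data stored in plaintext", "severity": "Critical",
     "recommendation": "Implement encryption for sensitive data at rest and in transit."},
    {"finding_id": "F006", "finding": "SQL Injection vulnerability", "severity": "Critical",
     "recommendation": "Use parameterized queries and input validation."},
]

# selectattr keeps only the records whose 'severity' passes the 'equalto' test.
CRITICAL_TEMPLATE = """\
{% for x in data | selectattr('severity', 'equalto', 'Critical') -%}
#### {{ x['finding_id'] }} - {{ x['finding'] }}

> {{ x['recommendation'] }}

{% endfor %}"""

critical_markdown = jinja2.Template(CRITICAL_TEMPLATE).render(data=data)
print(critical_markdown)
```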

All the code

security_findings.csv

The security findings file is a simple CSV file with only a handful of columns. This can be expanded with a lot more data. You can also read JSON files, or anything else for that matter. As long as you can get the data into a Python dictionary or a list, you can render the data in Jinja2.

category,finding_id,finding,severity,recommendation
Access Control,F001,Weak password policy,High,Enforce strong password policies with complexity requirements.
Network Security,F002,Open ports detected,Medium,Restrict unnecessary open ports and use a firewall.
Data Protection,F003,Sensitive data stored in plaintext,Critical,Implement encryption for sensitive data at rest and in transit.
Access Control,F004,Excessive user privileges,Medium,Review and reduce user access rights based on roles.
System Configuration,F005,Unpatched vulnerabilities,High,Apply security patches and updates regularly.
Application Security,F006,SQL Injection vulnerability,Critical,Use parameterized queries and input validation.
Network Security,F007,Weak encryption algorithms,High,Replace outdated encryption protocols with strong ones such as AES-256.
Endpoint Security,F008,Antivirus not installed,Medium,Deploy and maintain up-to-date antivirus software on endpoints.
Incident Response,F009,Lack of incident response plan,Low,Develop and test an incident response plan.
Access Control,F010,Inactive accounts not disabled,Medium,Implement a process to disable or delete inactive accounts.
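For the JSON case mentioned above, a reader function might look something like the sketch below. It assumes the file holds a top-level array of objects with the same keys as the CSV columns; the function name `read_json` and the file name `security_findings.json` are hypothetical. The demo writes a small file to a temporary directory first, so it can run on its own.

```python
import json
import tempfile
from pathlib import Path

def read_json(path):
    """Load a JSON array of objects into a list of dictionaries."""
    with open(path, "rt", encoding="utf-8") as jsonfile:
        records = json.load(jsonfile)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array of findings")
    return records

# Demo data: one finding with the same keys as the CSV header.
sample = [
    {"category": "Access Control", "finding_id": "F001",
     "finding": "Weak password policy", "severity": "High",
     "recommendation": "Enforce strong password policies with complexity requirements."},
]

# Write a hypothetical JSON file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    json_path = Path(tmp) / "security_findings.json"
    json_path.write_text(json.dumps(sample), encoding="utf-8")
    findings = read_json(json_path)

print(f"Read {len(findings)} records.")
```

The resulting list has the same shape as the output of a CSV reader based on `csv.DictReader`, so it can be passed to a template as `data` in the same way.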

render.py

The Python script utilising Jinja2 is quite simple. It reads the CSV file, passes that data structure to the template, and lets the template do the heavy lifting.

import jinja2
import csv

def readCSV(file):
    print(f"Read {file}...")
    output = []
    with open(file, 'rt',encoding='utf-8') as csvfile:      
        reader = csv.DictReader(csvfile)
        for row in reader:
            output.append(row)
    print(f" --> Read {len(output)} records.")
    return output

def render_jinja(data,template,output):
    template_dir = '.'  # change this if your templates are somewhere else
    # Build the environment first, then load the named template from it.
    env = jinja2.Environment(loader=jinja2.FileSystemLoader(template_dir))
    tmpl = env.get_template(template)
    # The list of rows is exposed to the template under the name "data".
    result = tmpl.render(data = data)
    print(f"Writing {output}")
    with open(output,'wt',encoding='utf-8') as q:
        q.write(result)

x = readCSV('security_findings.csv')
render_jinja(x,"template.md","output.md")
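One thing to watch for with this approach: by default, Jinja2 renders a missing key as an empty string, so a misspelled column name quietly produces empty table cells. If you would rather have the render fail, one option is to create the environment with `jinja2.StrictUndefined`. The sketch below uses a deliberately misspelled key (`categry`) and illustrative variable names to show the difference.

```python
import jinja2

row = {"category": "Access Control"}
source = "{{ row['categry'] }}"  # deliberate typo in the key

# Default behaviour: the unknown key renders as an empty string.
lenient_env = jinja2.Environment()
lenient_result = lenient_env.from_string(source).render(row=row)

# StrictUndefined: using the unknown key raises an error instead.
strict_env = jinja2.Environment(undefined=jinja2.StrictUndefined)
try:
    strict_env.from_string(source).render(row=row)
    strict_raised = False
except jinja2.UndefinedError:
    strict_raised = True

print(repr(lenient_result), strict_raised)
```

The `is not defined` check used in the category count template still works with `StrictUndefined`, because testing whether a value is defined does not raise.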

template.md

This basic template has a number of neat tricks included. It shows you how to parse the raw data, and generate the tables you've seen above.

{% set unique_categories = data | map(attribute='category') | map('default', '') | unique | list %}

{%- macro severityicon(text) -%}
    {{ '![icon](https://img.shields.io/badge/Critical-black)' if text == 'Critical' else 
       '![icon](https://img.shields.io/badge/High-red)' if text == 'High' else 
       '![icon](https://img.shields.io/badge/Medium-yellow)' if text == 'Medium' else 
       '![icon](https://img.shields.io/badge/Low-green)' if text == 'Low' else 
       '![icon](https://img.shields.io/badge/Unknown-blue)' }}
{%- endmacro -%}

# Example Security Finding report

This is an example of our security findings.

## Basic CSV data dump

First, let me show you a full table of all the raw data we collected in the CSV file. No manipulation or anything is done to the data - what you see is what you get.

|**Category**|**Finding id**|**Finding**|**Severity**|**Recommendation**|
|------------|--------------|-----------|------------|------------------|
{% for i in data -%}
|{{ i['category'] }}|{{ i['finding_id'] }}|{{ i['finding'] }}|{{ i['severity'] }}|{{ i['recommendation'] }}|
{% endfor %}


## Summary data

We can parse through the data, and generate a count of the number of findings in each category.

{# Initialize category counts #}
{%- set category_counts = {} -%}
{%- for item in data -%}
    {%- set category = item['category'] -%}
    {%- if category_counts[category] is not defined -%}
        {%- set _ = category_counts.update({category: 1}) -%}
    {%- else -%}
        {%- set _ = category_counts.update({category: category_counts[category] + 1}) -%}
    {%- endif -%}
{%- endfor -%}

| **Category**       | **Count** |
|--------------------|----------|
{% for category, count in category_counts.items() -%}
| {{ category }}     | {{ count }} |
{% endfor %}

## More advanced

Showing the data by a category is a much better way to go.  Let's group by the category, and make the severity more colourful.

|**Category**|**Finding**|**Severity**|**Recommendation**|
|--|--|--|--|
{% for c in unique_categories %}|**{{ c }}**||||
{% for x in data if x['category'] == c -%}
||`{{ x['finding_id'] }}` - {{ x['finding'] }}|{{ severityicon(x['severity']) }}|{{ x['recommendation'] }}|
{% endfor -%}
{% endfor %}

## Only the criticals

In this view, we will show only the critical findings.

{%- for x in data -%}
{% if x['severity'] == 'Critical' -%}
### {{ x['finding_id'] }} - {{ x['finding'] }}

A {{ x['severity'] }} finding (`{{ x['finding_id'] }}`) was recorded.

**Finding**

> {{ x['finding'] }}

**Recommendation**

> {{ x['recommendation']}}
{% endif %}
{% endfor %}
