Script to rename pdf papers to appropriate titles


When you download the PDF of a paper from an online journal, the file is typically named after the journal plus some ID, such as nrg2950.pdf or nihms551112.pdf, which is not very informative. I like to use a bash script with pdfinfo to rename PDFs to the paper’s title.

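#!/bin/bash
# Rename each PDF in the current directory to the title stored in its metadata.
for FILE in *.pdf ; do
    # skip the literal pattern when no PDFs match
    [ -e "$FILE" ] || continue
    # read the Title field from the PDF metadata (empty if pdfinfo fails)
    TITLE=$(pdfinfo "$FILE" 2>/dev/null | grep '^Title:' | sed 's/^Title:[ ]*//')
    # if not empty, take title as file name
    if [ -n "$TITLE" ]; then
        # replace every character that is not an ASCII letter or digit with an underscore
        TITLE=${TITLE//[^a-zA-Z0-9]/_}
        echo "$TITLE"
        # -n: do not overwrite an existing file that has the same name
        mv -n -- "$FILE" "${TITLE}.pdf"
    fi
done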
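
To see what the substitution step does to a title on its own, here is a small standalone sketch. The function name sanitize_title and the sample title are made up for illustration; only the ${TITLE//[^a-zA-Z0-9]/_} expansion comes from the script above.

```bash
#!/bin/bash
# sanitize_title is a hypothetical helper name, not part of the original script.
sanitize_title() {
    local title=$1
    # Replace every character that is not an ASCII letter or digit with "_".
    printf '%s\n' "${title//[^a-zA-Z0-9]/_}"
}

# Sample title invented for this example.
sanitize_title 'Genome-wide association studies: a review'
# prints Genome_wide_association_studies__a_review
```

Note that each punctuation mark and each space becomes its own underscore, so a colon followed by a space turns into two underscores in a row.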

You can modify this script to rename PDFs to a convention you prefer, such as JOURNAL_DATE_FIRSTAUTHORLASTNAME_TITLE.pdf.
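
As a rough sketch of such a convention, the helper below joins several fields into one file name. The function name build_name and the sample values are assumptions for illustration. How you obtain each field is up to you: pdfinfo prints Author and CreationDate lines when the metadata contains them, but the journal name is usually not stored in the PDF and may need to be supplied by hand.

```bash
#!/bin/bash
# build_name is a hypothetical helper; all four fields are supplied by the caller.
# Usage: build_name JOURNAL DATE FIRST_AUTHOR_LAST_NAME TITLE
build_name() {
    local part name=""
    for part in "$@"; do
        # Same substitution as the script above: non-alphanumerics become "_".
        part=${part//[^a-zA-Z0-9]/_}
        # Join the fields with "_" (no separator before the first one).
        name+="${name:+_}${part}"
    done
    printf '%s.pdf\n' "$name"
}

# Invented sample values, for illustration only.
build_name 'SomeJournal' '2014' 'Doe' 'An example title'
# prints SomeJournal_2014_Doe_An_example_title.pdf
```

Inside the renaming loop, the result could be used as the target of mv in place of "${TITLE}.pdf".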

2 Comments

  1. Jiao Chen

    Hi, how do I use this script? I tried the pdfinfo command, but the title it reports is not the paper’s title.

    • Jean

      This script requires the paper title to be stored in the PDF’s metadata, i.e., the title shown when you right-click the document and select ‘Get Info’.
