A great place to find textbooks and instructional videos for programming languages.
Safari Academic Edition includes more than 35K book titles plus 30K+ hours of video, proven learning paths, case studies, interactive tutorials, audio books, and videos from O’Reilly's global conferences. Includes access to exclusive O’Reilly content and resources from more than 200 other publishers.
Special Access Notes: Deep linking instructions for Safari ebooks and videos
Peer-reviewed tutorials targeting historians, but broadly useful. Browsing the tutorials also gives you a good sense of the range of digital humanities activities.
A collection of syllabi broadly construed under the umbrella of 'digital humanities', offering a variety of disciplinary and institutional perspectives.
A getting-started, do-it-yourself tutorial for XML (eXtensible Markup Language), specifically relating to the widely accepted standards of the Text Encoding Initiative.
A survey of popular tools and resources for the digital humanities, grouped by general task, including workflow management, understanding fair use, design, and specific apps for specific activities.
The easiest option for OCR of PDFs is Adobe Acrobat, which all PSU students, faculty, and staff have access to via our institutional subscription to Adobe Creative Cloud.
Tesseract is the standard open source software for OCR. It works with over 100 languages by default and can be trained on new languages or retrained to improve accuracy. The documentation is good, but Tesseract requires working on the command line (the project page does link to a list of programs/apps for Tesseract with a user interface).
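For readers new to the command line, the following is a minimal sketch of how a Tesseract run might be scripted from Python. It assumes Tesseract is already installed and available on your PATH, and that the language data for the chosen language code (here `eng`) is installed. The function names `build_tesseract_command` and `ocr_image` and the file name `scan.png` are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for a basic Tesseract run.

    Tesseract's basic usage is: tesseract IMAGE OUTPUT_BASE -l LANG
    It appends ".txt" to OUTPUT_BASE when writing plain-text output.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run Tesseract on one image and return the recognized text.

    Assumes the tesseract executable is on PATH (hypothetical helper).
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")

    image_path = Path(image_path)
    # Write the output next to the image, e.g. scan.png -> scan.txt
    output_base = image_path.with_suffix("")
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang),
        check=True,  # raise an error if Tesseract exits with a failure code
    )
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    # "scan.png" is a placeholder; replace it with your own page image.
    print(ocr_image("scan.png"))
```

The same run can be typed directly in a terminal as `tesseract scan.png scan -l eng`; wrapping it in a script mainly helps when you need to process many page images in a loop.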
Kraken is open source software that specializes in non-Latin writing systems, including bidirectional, right-to-left, and top-to-bottom scripts. It is Python-based and runs on the command line.