A great place to find textbooks and instructional videos for programming languages.
Safari Academic Edition includes more than 35K book titles plus 30K+ hours of video, proven learning paths, case studies, interactive tutorials, audio books, and videos from O’Reilly's global conferences. Includes access to exclusive O’Reilly content and resources from more than 200 other publishers.
Peer-reviewed tutorials targeting historians, but broadly useful. Browsing the tutorials also gives you a good sense of the range of digital humanities activities.
"#dariahTeach is a platform for Open Educational Resources (OER) for Digital Arts and Humanities educators and students, but also beyond this aiming at Higher Education across a spectrum of disciplines, at teachers and trainers engaged in the digital transformation of programme content and learning methods. #dariahTeach has two key objectives: sharing and reuse, thus developing a place for people to publish their teaching material and for others to use it in their own teaching" (from the website's about page).
A collection of syllabi broadly construed under the umbrella of 'digital humanities', offering a variety of disciplinary and institutional perspectives.
A getting-started, do-it-yourself tutorial for XML (eXtensible Markup Language), specifically relating to the widely accepted standards of the Text Encoding Initiative (TEI).
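For a sense of what TEI-encoded XML looks like in practice, here is a minimal sketch in Python. The sample document, its contents, and the `extract` helper are invented for illustration and do not come from the tutorial; only the TEI namespace and the element names (`teiHeader`, `titleStmt`, `persName`, `placeName`) are standard TEI.

```python
import xml.etree.ElementTree as ET

# The TEI namespace is standard; the "tei" prefix is just a local label.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A tiny, invented TEI document: a header with a title, and one paragraph
# in which a person and a place have been tagged.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter</title></titleStmt>
      <publicationStmt><p>Unpublished example.</p></publicationStmt>
      <sourceDesc><p>Invented for illustration.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Dear <persName>Ada</persName>, I arrived in <placeName>Leeds</placeName> today.</p>
    </body>
  </text>
</TEI>"""


def extract(xml_text):
    """Return the title plus all tagged person and place names."""
    root = ET.fromstring(xml_text)
    title = root.find(".//tei:titleStmt/tei:title", TEI_NS).text
    people = [el.text for el in root.iterfind(".//tei:persName", TEI_NS)]
    places = [el.text for el in root.iterfind(".//tei:placeName", TEI_NS)]
    return title, people, places


if __name__ == "__main__":
    print(extract(SAMPLE))
```

The point of the markup is visible here: once names and places are tagged, they can be pulled out of a text reliably instead of being guessed at with string searches.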
A survey of popular tools and resources for the Digital Humanities, grouped by general task, including workflow management, understanding fair use, design, and apps for specific activities.
The easiest option for OCR of PDFs is Adobe Acrobat (which all PSU students, faculty, and staff have access to via our institutional subscription to Adobe Creative Cloud).
Tesseract is the standard open source software for OCR. It works with over 100 languages by default and can be trained on new languages or retrained to improve accuracy. The documentation is good, but Tesseract requires working on the command line (the project page does link to a list of programs/apps that provide a user interface for Tesseract).
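If the command line is the sticking point, a basic Tesseract call is short: `tesseract IMAGE OUTPUTBASE -l LANG`, which writes the recognized text to `OUTPUTBASE.txt`. The sketch below wraps that call in Python. The function names and the file names in the usage note are hypothetical, and it assumes Tesseract is already installed and on your PATH with the relevant language data.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANG."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def run_ocr(image_path, output_base, lang="eng"):
    """Run Tesseract on one image and return the path of the text file.

    Assumes the tesseract executable is installed; raises if it is not found.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output base name for plain-text output.
    return Path(str(output_base) + ".txt")
```

For example, `run_ocr("page.png", "page", lang="deu")` would OCR a German-language scan named `page.png` (a placeholder file name) and return the path `page.txt`.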
Kraken is open source OCR software that specializes in non-Latin scripts, including bidirectional, right-to-left, and top-to-bottom scripts. It is Python-based and runs on the command line.