
Windows worker nodes

The worker nodes, which perform the actual conversions, are Windows servers. Most conversions involve Office applications; this, together with our main converter, Neevia, led us to focus our code on Windows. Nevertheless, the code is written in Python 3 and should be largely OS-agnostic, which allows future deployments on other operating systems or the addition of more converters, e.g. poster conversion or OCR to text.


Patching

Patching of the nodes is controlled by an NSC reused from another IT-CDA-IC activity. We use a locally managed NSC: NSS: AVC & NSC: Recording and AV workflow server.

MS hotfixes are deployed by CMF, but the deployment must be triggered by an administrator. This usually involves a reboot. Please patch the servers one at a time, so that the service remains available. Workflow:

- Stop the converter:

(doconverter) C:\doconverter\doconverter\doconverter\engines>python converter_daemon.py --s
Namespace(archive=0, basic_monitoring=False, computer=None, nprocesses=2, remove_stopper=False, sendtaskid=0, stopper=True, timetosleep=2)
[2017-05-15 17:44:50,590 converter_daemon.py:83 - <module>() ] Creates stopper file and exists!
- Apply patches via CMF & reboot
- Log in again with the service account and start the converter. Open a console and do:

C:\doconverter\doconverter\doconverter\engines> workon doconverter
(doconverter) C:\doconverter\doconverter\doconverter\engines>python converter_daemon.py --r
Namespace(archive=0, basic_monitoring=False, computer=None, nprocesses=2, remove_stopper=True, sendtaskid=0, stopper=False, timetosleep=2)
[2017-05-15 17:46:16,245 converter_daemon.py:92 - <module>() ] remove stopper from file system and exists!

(doconverter) C:\doconverter\doconverter\doconverter\engines>python converter_daemon.py --n 4
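The stop and restart steps above rely on a stopper file: while the file exists, the converter is expected to stop gracefully. The following is a minimal sketch of that idea, not the actual code of converter_daemon.py. The function names (`create_stopper`, `remove_stopper`, `run_worker`) and the in-memory task list are hypothetical and only serve as illustration.

```python
import tempfile
import time
from pathlib import Path


def create_stopper(stopper: Path) -> None:
    """Ask the workers to stop by creating an empty stopper file."""
    stopper.touch()


def remove_stopper(stopper: Path) -> None:
    """Allow the workers to run again by deleting the stopper file."""
    stopper.unlink(missing_ok=True)  # missing_ok requires Python 3.8+


def run_worker(stopper: Path, pending: list, time_to_sleep: float = 0.0) -> list:
    """Process tasks until the list is empty or the stopper file appears.

    Hypothetical loop: the stopper is checked before each task, so a task
    that has already started is never interrupted.
    """
    done = []
    while pending:
        if stopper.exists():
            break  # stop gracefully, remaining tasks stay pending
        done.append(pending.pop(0))
        time.sleep(time_to_sleep)
    return done


if __name__ == "__main__":
    # Use a temporary directory instead of c:\doconverter for the demo.
    stopper_file = Path(tempfile.mkdtemp()) / "noconverter.txt"
    print(run_worker(stopper_file, ["task1", "task2"]))  # both tasks run
    create_stopper(stopper_file)
    print(run_worker(stopper_file, ["task3"]))  # nothing runs while stopped
    remove_stopper(stopper_file)
```

Checking for the file between tasks, rather than killing the process, is what lets a node be taken out of service for patching without interrupting a conversion that is already running.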

Where the doconverter software is located on a Windows node

All files related to doconverter are located under c:\doconverter

doconverter_dir.png

A summary of the important directories:

- cert: certificate used for https
- doconverter: where the doconverter code is deployed, see newversiongit.md
- files: contains a set of files for functional testing
- logs
- testing: used for functional testing
- config: doconverter.ini configures doconverter, and logging.conf configures the logging module

# config: doconverter ini

A brief explanation of doconverter configuration:

[default]
# Which files can be uploaded 
extensions_all=doc,docx,ppt,pptx,xlsx,tif,htm,txt,png,jpg
# Where you are mounting your EOS volume: \\cernbox-smb.cern.ch\eos\project\d\doconverter
prefix_dir=Y:\
archival_dir=Y:\
# CA certificate used for https
ca_bundle=c:\doconverter\cert\COMODO_OV_SHA-256_bundle.crt
# Worker nodes available
servers=doconverter01,doconverter02
# Formats that the worker node provides: input and output formats
doconverter01=doc,docx,ppt,pptx,xlsx,tif,htm,txt,png,jpg,pdf,ps,pdfa
doconverter02=tif,htm,txt,png,jpg,pdf,ps,pdfa
[manager]
# Possible converters, ',' separated
converters=Neevia
# Stopper file: while it exists on the filesystem, the converter stops gracefully
stopper=c:\doconverter\noconverter.txt
[monitor]
# Whom to warn in case of alerts
emails=ruben.gaspar.aparicio@cern.ch,conversion-admins@cern.ch
tasksalert=50
smtpserver=cernmx.cern.ch
# Converter configuration one per 'converters' defined
[Neevia]
extensions_allowed=doc,docx,ppt,pptx,xlsx,tif,htm,txt,png,jpg
output_allowed=pdf,png,ps,pdfa
type=windows
exe=dConverter.exe
[database]
# Logging database
host=dbod-docprod.cern.ch
port=6600
db=doconverter
user=postgresql://doconverter
password=XXXXXXX
# Testing settings for functional testing
[test]
url=https://dev-doconverter.web.cern.ch/doconverter/api/v1.0/uploads
url_response=http://conv-test02:5000/doconverter/api/v1.0/received
diresponse=c:\doconverter\testing
files=c:\doconverter\files\wordfile.docx,c:\doconverter\files\excelfile.xlsx,c:\doconverter\files\htmfile.htm,
    c:\doconverter\files\picture.jpg,c:\doconverter\files\pngfile.png,c:\doconverter\files\powerpointfile.pptx
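As an illustration of how the `servers` list and the per-server format lines fit together, here is a minimal sketch that reads them with Python's standard `configparser` module. It is not taken from the doconverter code base: the helper names `load_config` and `servers_for` are hypothetical, and the sample text only repeats three keys from the configuration above.

```python
import configparser

# Excerpt of the [default] section shown above, embedded for the demo.
SAMPLE = """
[default]
servers=doconverter01,doconverter02
doconverter01=doc,docx,ppt,pptx,xlsx,tif,htm,txt,png,jpg,pdf,ps,pdfa
doconverter02=tif,htm,txt,png,jpg,pdf,ps,pdfa
"""


def load_config(text: str) -> configparser.ConfigParser:
    """Parse ini text; interpolation is disabled so '%' stays literal."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def servers_for(parser: configparser.ConfigParser, extension: str) -> list:
    """Return the worker nodes whose format list contains the extension."""
    section = parser["default"]
    servers = [s.strip() for s in section["servers"].split(",") if s.strip()]
    wanted = extension.lower()
    matching = []
    for server in servers:
        formats = section.get(server, fallback="")
        if wanted in {f.strip().lower() for f in formats.split(",")}:
            matching.append(server)
    return matching


if __name__ == "__main__":
    config = load_config(SAMPLE)
    print(servers_for(config, "docx"))
    print(servers_for(config, "pdf"))
```

With the sample above, an Office format such as docx is only matched by doconverter01, while pdf is matched by both nodes. Note that `[default]` in lower case is an ordinary section for `configparser`; only the upper-case `DEFAULT` name has special meaning.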

Scheduled Tasks

Several tasks are run by the Windows Task Scheduler. Adapt them to the settings of your server, and check a production server for the latest scheduled tasks:

  • Log rotation: done externally due to locking issues on Windows with multiprocess logging. It runs every day
Program: C:\doconverter\doconverter\logrotation_doconverter.cmd 
Arguments: c:\doconverter\logs\api.log DAY 4 doconverter
Schedule: every day at midnight
Run: When user is logged in. Identity: cdsconv
  • Archival task: moves old tasks

    Program: cmd 
    Arguments: /C "c:\doconverter\doconverter\venv\Scripts\activate & python c:\doconverter\doconverter\doconverter\engines\converter_daemon.py --a 1"
    Start in: C:\doconverter\doconverter\doconverter\engines
    Schedule: every day
    Run: When user is logged in. Identity: cdsconv
    

  • Monitoring task

Program: cmd 
Arguments: /C "c:\doconverter\doconverter\venv\Scripts\activate & python converter_daemon.py --m"
Start in: C:\doconverter\doconverter\doconverter\engines
Schedule: every hour
Run: When user is logged in. Identity: cdsconv
  • Cleanup temp: mainly due to how Neevia works.
Program: \\cern.ch\dfs\Services\conversion\Scripts\cleanoldfilesandirs.cmd
Arguments: C:\Users\cdsconv\AppData\Local\Temp YES 7
Start in: C:\Users\cdsconv\AppData\Local\Temp
Schedule: daily
Run: When user is logged in. Identity: cdsconv
  • Cleanup orig: mainly due to how Neevia works.
Program: \\cern.ch\dfs\Services\conversion\Scripts\cleanoldfilesandirs.cmd
Arguments: c:\PROGRA~1\neevia.com\docConverterPro\DEF_FOLDERS\ORIG YES 2 
Start in: c:\PROGRA~1\neevia.com\docConverterPro\DEF_FOLDERS\ORIG
Schedule: daily
Run: When user is logged in. Identity: cdsconv
  • Cleanup of local archive files
Program: \\cern.ch\dfs\Services\conversion\Scripts\cleanoldfilesandirs.cmd
Arguments: C:\Users\cdsconv\cernbox\doconv01-test\archive\2017 YES 2 
Start in: C:\Users\cdsconv\cernbox\doconv01-test\archive\2017
Schedule: daily
Run: When user is logged in. Identity: cdsconv
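The three cleanup tasks above call cleanoldfilesandirs.cmd, whose source is not shown here. As a rough Python sketch of what such a cleanup could look like, the function below deletes files older than a given number of days and removes directories that are left empty. This assumes that the numeric argument of the script is an age in days, which is a guess; the function name `remove_old_files` is hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def remove_old_files(root: Path, max_age_days: float) -> list:
    """Delete files under root whose modification time is older than
    max_age_days, then remove subdirectories that are empty.

    Assumption: age is measured from the last modification time.
    Returns the list of deleted file paths.
    """
    cutoff = time.time() - max_age_days * SECONDS_PER_DAY
    removed = []
    # Walk bottom-up so a directory is visited after its contents.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        current = Path(dirpath)
        for name in filenames:
            path = current / name
            if path.stat().st_mtime < cutoff:
                path.unlink()
                removed.append(path)
        # Never remove the root directory itself.
        if current != root and not any(current.iterdir()):
            current.rmdir()
    return removed


if __name__ == "__main__":
    # Demo on a throwaway directory, not on a real Temp or ORIG folder.
    demo_root = Path(tempfile.mkdtemp())
    stale = demo_root / "stale.tmp"
    stale.write_text("old")
    ten_days_ago = time.time() - 10 * SECONDS_PER_DAY
    os.utime(stale, (ten_days_ago, ten_days_ago))
    print(remove_old_files(demo_root, 7))
```

Try any such script on a scratch directory first, since it deletes files without confirmation.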

Last update: January 9, 2022