hadoop - Better way to store password during oozie spark job workflow -


I have an Oozie workflow that executes a Spark job, and the job needs usernames and passwords to connect to various servers. Right now I pass them in workflow.xml as plain arguments: username, password.
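For illustration, the current (insecure) setup might look like this hypothetical workflow.xml fragment, where the credentials are passed as plain `<arg>` values (action name and values are made up):

```xml
<action name="spark-job">
  <spark xmlns="uri:oozie:spark-action:0.2">
    <!-- ...master, jar, class, etc... -->
    <arg>myUser</arg>
    <arg>myPassword</arg>
    <!-- visible to anyone who can read the job definition -->
  </spark>
  <ok to="end"/>
  <error to="fail"/>
</action>
```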

That is (of course) a bad way to do it, since it makes the passwords visible. What is the standard way to obfuscate a password in such a case?

thanks!

Sqoop is an interesting case, as you can see in its documentation:

  • At first there was only the --password command-line option, followed by the password in plain text (yuck!)
  • Then --password-file was introduced, followed by the name of a file that contains the password; it's a clear improvement because
    (a) when running on an edge node, the password is not visible to anyone running a ps command
    (b) when running in Oozie, you can upload the file once to HDFS, then tell Oozie to download it into the CWD of the YARN container running the job (with a <file> option), so the password is not visible to anyone who inspects the job definition
    ---- but don't forget to restrict access to that damn file, both on the edge node and on HDFS, otherwise the password can still be compromised ----
  • Finally, an optional credential store was introduced in Hadoop, and Sqoop supports it natively (although you still have the issue of protecting whatever secret you use to connect to the credential store...)
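As a sketch of the last two options, the CLI invocations look roughly like this (connection strings, paths, and aliases are placeholders, not tested against a real cluster):

```shell
# Option 2: password in a restricted file (chmod 400!) instead of on the command line
sqoop import --connect jdbc:mysql://dbhost/mydb --username myUser \
     --password-file /user/myUser/mydb.password ...

# Option 3: store the password in a Hadoop credential store (JCEKS keystore on HDFS)...
hadoop credential create mydb.password.alias \
     -provider jceks://hdfs/user/myUser/creds.jceks

# ...then reference it by alias, so no plaintext password appears anywhere
sqoop import \
     -D hadoop.security.credential.provider.path=jceks://hdfs/user/myUser/creds.jceks \
     --connect jdbc:mysql://dbhost/mydb --username myUser \
     --password-alias mydb.password.alias ...
```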

Similarly, for Spark (and for any custom Java / Scala / Python app) I suggest you store the sensitive information in a "properties" file, restrict access to that file, and pass its name as a command-line argument to your program.
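A minimal sketch of that approach in Python, assuming a simple key=value properties format; the file name and the keys (db.user, db.password) are hypothetical:

```python
import os
import stat


def load_props(path):
    """Parse a simple key=value properties file into a dict,
    skipping blank lines and # comments."""
    props = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


# Create a demo properties file, then lock it down to
# owner read/write only (chmod 600) before using it.
with open("app.properties", "w") as f:
    f.write("# environment-specific credentials\n")
    f.write("db.user=myUser\n")
    f.write("db.password=s3cret\n")
os.chmod("app.properties", stat.S_IRUSR | stat.S_IWUSR)

props = load_props("app.properties")
```

The program then reads `props["db.user"]` and `props["db.password"]` at runtime, so the secrets never appear on the command line or in the job definition.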

It also makes your life easier if you have distinct dev / test / prod environments: the Oozie script and the "properties" filename stay exactly the same, while the actual props are environment-specific.
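One way to lay that out, sketched here with a hypothetical `conf/$ENV` directory convention: each environment keeps its own copy of the file under the same name, and a deploy step picks the right one, so nothing in the workflow itself changes between environments.

```shell
# Same filename in every environment, different contents
mkdir -p conf/dev conf/prod
echo "db.password=dev-secret"  > conf/dev/app.properties
echo "db.password=prod-secret" > conf/prod/app.properties

# The deploy step selects the file for the target cluster
ENV=dev
cp "conf/$ENV/app.properties" app.properties
```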
