Morihaya Memo φ(・ω・ )

From infrastructure engineer to SRE

A note on Embulk filter plugins not being loaded because of a config typo

What this is

I ran into a case where Embulk's filter plugins were not loaded because of a really silly mistake, so here is a note about it.

What happened

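I wrote filter: (without the s) where it should have been filters: and the result was:

  • Embulk itself still runs (the config is not rejected as a syntax error)
  • The filter plugins are not loaded

In other words, the typo fails silently.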
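
One way to catch this kind of typo before a run is to scan the config for unexpected top-level keys. The following is a minimal sketch in Python, not something Embulk provides: the list of known keys (exec, in, filters, out) is an assumption based on the config in this post, and find_unknown_keys is a hypothetical helper name. It uses a simple regex rather than a YAML parser, because the {{ env.* }} Liquid placeholders may not parse cleanly as plain YAML before templating.

```python
import difflib
import re

# Assumption: the top-level keys expected in this kind of Embulk config.
KNOWN_TOP_LEVEL_KEYS = ["exec", "in", "filters", "out"]

# A top-level key is an unindented "name:" at the start of a line.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*):", re.MULTILINE)


def find_unknown_keys(config_text):
    """Return (key, suggestion) pairs for unrecognized top-level keys."""
    findings = []
    for key in TOP_LEVEL_KEY.findall(config_text):
        if key in KNOWN_TOP_LEVEL_KEYS:
            continue
        # Suggest the closest known key, if any is similar enough.
        close = difflib.get_close_matches(key, KNOWN_TOP_LEVEL_KEYS, n=1)
        findings.append((key, close[0] if close else None))
    return findings


# Shortened version of the broken config, with the "filter:" typo.
sample = """\
in:
  type: sftp
filter:
  - type: column
out:
  type: postgresql
"""

for key, suggestion in find_unknown_keys(sample):
    hint = f" (did you mean '{suggestion}'?)" if suggestion else ""
    print(f"unknown top-level key '{key}'{hint}")
# prints: unknown top-level key 'filter' (did you mean 'filters'?)
```

Running a check like this in CI or just before embulk run would turn the silent failure into a visible one. If a config legitimately uses other top-level keys, they would need to be added to the list.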

Config

Config where the filter plugins were not loaded ( filter: )

exec:
  min_output_tasks: 1
  max_threads: 1

in:
  type: sftp
  host: {{ env.SFTP_HOST }}
  user: {{ env.SFTP_USER }}
  password: {{ env.SFTP_PASSWORD }}
...
filter:
  - type: column
    columns:
      - { name: send_date }
      - { name: message_name }
      - { name: message_type }
      - { name: message_text }
      - { name: send_status }
      - { name: mobile_message_tracking_id }
      - { name: kokyaku_id }
  - type: row
    where: |-
      (
        send_date IS NOT NULL AND
        message_name IS NOT NULL AND
        message_text IS NOT NULL AND
        kokyaku_id IS NOT NULL
      )

out:
  type: postgresql
  host: {{ env.pg_host }} 
  user: {{ env.pg_user }}
  password: {{ env.pg_pass }}
  database: {{ env.database }}
  schema: {{ env.schema }}
.... 

Correct config where the filter plugins are loaded ( filters: )

exec:
  min_output_tasks: 1
  max_threads: 1

in:
  type: sftp
  host: {{ env.SFTP_HOST }}
  user: {{ env.SFTP_USER }}
  password: {{ env.SFTP_PASSWORD }}
...
filters:
  - type: column
    columns:
      - { name: send_date }
      - { name: message_name }
      - { name: message_type }
      - { name: message_text }
      - { name: send_status }
      - { name: mobile_message_tracking_id }
      - { name: kokyaku_id }
  - type: row
    where: |-
      (
        send_date IS NOT NULL AND
        message_name IS NOT NULL AND
        message_text IS NOT NULL AND
        kokyaku_id IS NOT NULL
      )

out:
  type: postgresql
  host: {{ env.pg_host }} 
  user: {{ env.pg_user }}
  password: {{ env.pg_pass }}
  database: {{ env.database }}
  schema: {{ env.schema }}
.... 

Logs

Log when the filter plugins were not loaded

2019-05-15 07:14:41.720 +0900: Embulk v0.9.9
2019-05-15 07:14:42.421 +0900 [WARN] (main): DEPRECATION: JRuby org.jruby.embed.ScriptingContainer is directly injected.
2019-05-15 07:14:44.598 +0900 [INFO] (main): Gem's home and path are set by default: "/home/embulk/.embulk/lib/gems"
2019-05-15 07:14:46.353 +0900 [INFO] (main): Started Embulk v0.9.9
2019-05-15 07:14:46.414 +0900 [INFO] (0001:transaction): Loaded plugin embulk-input-sftp (0.2.12)
2019-05-15 07:14:46.436 +0900 [INFO] (0001:transaction): Loaded plugin embulk-output-postgresql (0.8.1)
2019-05-15 07:14:46.476 +0900 [INFO] (0001:transaction): Connecting to sftp://myuser:***@hogehoge.com:22/
2019-05-15 07:14:46.479 +0900 [INFO] (0001:transaction): Getting to download file list

Log when the filter plugins were loaded with the correct syntax. Loaded plugin embulk-filter-column and Loaded plugin embulk-filter-row both show up as expected.

2019-05-15 07:34:11.426 +0900: Embulk v0.9.9
2019-05-15 07:34:12.139 +0900 [WARN] (main): DEPRECATION: JRuby org.jruby.embed.ScriptingContainer is directly injected.
2019-05-15 07:34:14.306 +0900 [INFO] (main): Gem's home and path are set by default: "/home/embulk/.embulk/lib/gems"
2019-05-15 07:34:16.017 +0900 [INFO] (main): Started Embulk v0.9.9
2019-05-15 07:34:16.075 +0900 [INFO] (0001:transaction): Loaded plugin embulk-input-sftp (0.2.12)
2019-05-15 07:34:16.098 +0900 [INFO] (0001:transaction): Loaded plugin embulk-output-postgresql (0.8.1)
2019-05-15 07:34:16.127 +0900 [INFO] (0001:transaction): Loaded plugin embulk-filter-column (0.7.1)
2019-05-15 07:34:16.145 +0900 [INFO] (0001:transaction): Loaded plugin embulk-filter-row (0.5.1)
2019-05-15 07:34:16.174 +0900 [INFO] (0001:transaction): Connecting to sftp://myuser:***@hogehoge.com:22/
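
Since the only visible difference between the two runs is the presence of the Loaded plugin embulk-filter-* lines, the log itself can be checked. The sketch below is one possible way to do that in Python, assuming the log format shown above. The function names loaded_filter_plugins and missing_filters are hypothetical, and the expected filter types (column, row) are taken from the config in this post.

```python
import re

# Matches lines such as "... Loaded plugin embulk-filter-column (0.7.1)".
LOADED_PLUGIN = re.compile(r"Loaded plugin (embulk-[\w-]+) \(([^)]+)\)")


def loaded_filter_plugins(log_text):
    """Return the names of filter plugins the log reports as loaded."""
    return [
        name
        for name, _version in LOADED_PLUGIN.findall(log_text)
        if name.startswith("embulk-filter-")
    ]


def missing_filters(log_text, expected_types):
    """Return expected filter types that have no "Loaded plugin" line."""
    loaded = set(loaded_filter_plugins(log_text))
    return [t for t in expected_types if f"embulk-filter-{t}" not in loaded]


# Excerpts from the two logs above.
bad_log = """\
2019-05-15 07:14:46.414 +0900 [INFO] (0001:transaction): Loaded plugin embulk-input-sftp (0.2.12)
2019-05-15 07:14:46.436 +0900 [INFO] (0001:transaction): Loaded plugin embulk-output-postgresql (0.8.1)
"""
good_log = bad_log + """\
2019-05-15 07:34:16.127 +0900 [INFO] (0001:transaction): Loaded plugin embulk-filter-column (0.7.1)
2019-05-15 07:34:16.145 +0900 [INFO] (0001:transaction): Loaded plugin embulk-filter-row (0.5.1)
"""

print(missing_filters(bad_log, ["column", "row"]))   # prints: ['column', 'row']
print(missing_filters(good_log, ["column", "row"]))  # prints: []
```

A non-empty result means a filter that the config was supposed to apply never got loaded, which is exactly the symptom of the filter: typo.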

Going forward

Typing things by hand out of habit breeds silly mistakes like this one, so copying and pasting config that is known to work matters too.