shell

#shell#

游客sz4konkwbrlv4

Shell access for a dedicated virtual cloud host

Can the most basic tier of the dedicated virtual cloud host be accessed over a shell connection?

一码平川MACHEL

Has anyone run into this problem, and is there a solution? I was trying out scrapy shell.

k8s小能手

A technical question: I have deployed a set of pods with 3 replicas. After updating the container image I need to trigger a shell command, but if I put it in the post-start command all 3 replicas run it, and it only needs to run once. How can this be done? Any ideas would be appreciated.
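
A minimal sketch of one possible approach (the deployment and image names are placeholders, not taken from the question): keep the command out of the pods' post-start hook and run it exactly once as a separate one-off Kubernetes Job after the rollout completes:

    # wait for the new image to finish rolling out, then run the command once in a one-off Job
    kubectl rollout status deployment/my-app
    kubectl create job my-app-post-update --image=myregistry/my-app:latest \
      -- /bin/sh -c 'echo "run the one-time command here"'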

中间件小哥

For an application published with the Lightweight Distributed Application Service, how do I log in to the corresponding server or container to investigate a problem?

小六码奴

Adding an S3 sync step in EMR

After all the other steps have finished, I want to run a final step that copies the S3 data to another bucket. I haven't found any supported way to run a shell command as a step. https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-commandrunner.html supports s3-dist-cp, which I could use if it let me overwrite the data in the target directory. I need something like this:

{
  action_on_failure = "CONTINUE"
  name = "copy s3 data"
  hadoop_jar_step = [{
    args = ["bash", " aws s3 sync s3://bucket1/data s3://bucket2/data"]
    jar = "command-runner.jar"
  }]
}
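
A minimal sketch of how such a step might be submitted with the AWS CLI (the cluster ID and bucket names are placeholders, and quoting may need adjusting for your shell): command-runner.jar simply executes its arguments as a command on the master node, so bash -c can wrap an aws s3 sync:

    aws emr add-steps --cluster-id j-XXXXXXXXXXXXX \
      --steps 'Type=CUSTOM_JAR,Name=copy s3 data,ActionOnFailure=CONTINUE,Jar=command-runner.jar,Args=[bash,-c,"aws s3 sync s3://bucket1/data s3://bucket2/data"]'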

小六码奴

Problem executing a shell script that exists on the host via docker exec

I'm trying to execute a script on the master node of an AWS EMR cluster. The goal is to create a new conda env and link it to Jupyter; I'm following this AWS document. The problem is that, no matter what the script contains, I always get the same error when running sudo docker exec jupyterhub bash /home/hadoop/scripts/bootstrap.sh:

bash: /home/hadoop/scripts/bootstrap.sh: No such file or directory

I made sure the .sh file is in the right place. However, if I copy bootstrap.sh into the container and then run the same docker exec command, it works. What am I missing? I have tried a trivial script containing only the following, and it throws the same error:

#!/bin/bash
echo "Hello"

The document clearly says: Kernels are installed within the Docker container. The simplest way to accomplish this is to create a bash script with the installation commands, save it to the master node, and then run the script inside the jupyterhub container with the sudo docker exec jupyterhub script_name command.
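
A plausible explanation, offered as an assumption rather than a confirmed diagnosis: the path given to docker exec is resolved inside the container's filesystem, not on the EMR master node, so a script that only exists on the host is not found. A minimal sketch of two common workarounds:

    # copy the script into the container first, then execute it there
    sudo docker cp /home/hadoop/scripts/bootstrap.sh jupyterhub:/tmp/bootstrap.sh
    sudo docker exec jupyterhub bash /tmp/bootstrap.sh

    # or pipe it from the host into a shell running inside the container
    sudo docker exec -i jupyterhub bash < /home/hadoop/scripts/bootstrap.sh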

情殇殇~

The kill command in a shell script has no effect

The script looks like this:

#!/bin/bash
PID=$(ps -ef | grep xx.jar | grep -v grep | awk '{ print $2 }')
echo Application is already stopped
echo kill $PID
kill $PID

The file is named stop.sh. After running it with sh stop.sh, the process is still not killed. The file permissions are 777.
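
A minimal, more defensive variant of the same idea, offered only as a sketch (xx.jar is kept as the placeholder pattern): it skips the kill when no PID is found, reports what it is doing, and escalates to SIGKILL only if the process is still alive a few seconds later:

    #!/bin/bash
    PID=$(pgrep -f xx.jar)               # match against the full command line
    if [ -z "$PID" ]; then
        echo "Application is already stopped"
        exit 0
    fi
    echo "kill $PID"
    kill $PID
    sleep 5
    # force-kill only if the process survived the normal TERM signal
    kill -0 $PID 2>/dev/null && kill -9 $PID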

小六码奴

How to execute a Ruby system call in the context of the current shell

I'm making an rvm use ... call from inside a Ruby script: system "rvm use 2.5.5". When that runs I get:

RVM is not a function, selecting rubies with 'rvm use ...' will not work. You need to change your terminal emulator preferences to allow login shell.

I'm fairly sure I am logged in with a login shell, and ordinary RVM commands work fine in the terminal I have open. Does the system command pick up the current shell, or does it use something else by default? If it uses something else, what is the best way to have Ruby run the command in the context of the current shell?
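
A hedged aside, not a confirmed answer: rvm is a shell function loaded by login-shell initialization, and system spawns a fresh non-interactive shell where that function is not defined. A minimal sketch of invoking it through an explicit login shell instead:

    # -l starts bash as a login shell so rvm's init scripts are sourced before the command runs
    bash -lc 'rvm use 2.5.5 && ruby -v'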

小六码奴

Reverse shell - how to make the server print multiple messages sent to the client

My reverse shell only prints the output of the ls command the first time. How can I make the server print it as many times as I want?

The client:

require 'socket'
require 'open3'

def createClient(hostname, port)
  s = TCPSocket.new hostname, port
  while line = s.gets
    if line == "exit"
      s.close
    end
    stdin, stdout, stderr, wait_thr = Open3.popen3(line)
    s.puts("#{stdout.read}")
  end
end

createClient("127.0.0.1", 9090)

The server:

require 'socket'

def createServer(hostname, port)
  server = TCPServer.new(hostname, port)
  client = server.accept
  loop do
    message = gets.chomp
    if message == "exit"
      break
    end
    client.puts(message)
    while line = client.gets
      puts line
    end
  end
  client.close
end

createServer("127.0.0.1", 9090)

I expected the server to print the output of every ls command the client handles, but it only prints the first one.

k8s小能手

Jenkins - cannot create user data directory: /var/lib/jenkins/snap/docker/: Read-only file system

I'm following the documentation below to deploy from Jenkins to Kubernetes. I installed Jenkins in my own VM, but when a build runs it fails with the following error:

docker build -t myregistry.azurecr.io/my-svc:latest7 ./my-svc
create user data directory: /var/lib/jenkins/snap/docker/321: Read-only file system
Build step 'Execute shell' marked build as failure
Finished: FAILURE

All the directories are owned by the jenkins user, so I'm not sure why it runs into a permission problem.

poc@poc-ubuntu:~$ ls -ltr /var/lib/
drwxr-xr-x 18 jenkins jenkins 4096 Feb 18 16:45 jenkins

一码平川MACHEL

Reading a large file concurrently

I'm building a Python pipeline to process very large binary files (50+ GB). They are BAM files, a format used to represent genomes. My script is currently bottlenecked by two computationally expensive subprocess calls. These two commands take up roughly 80% of the compute time of each pipeline run, so I need to find a way to speed this up. They read from the same file. I'd like to know the best route to take to improve efficiency: basically, is there a particular flavor of concurrency that works best here, or is there some other interesting approach? The commands:

subprocess.call('samtools view -b -f 68 {} > {}_unmapped_one.bam'.format(self.file_path, self.file_prefix), shell=True)
subprocess.call('samtools view -b -f 132 {} > {}_unmapped_two.bam'.format(self.file_path, self.file_prefix), shell=True)
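
A minimal sketch of the simplest form of concurrency here, under the assumption that the two calls are independent and disk I/O is not already saturated: launch both samtools commands as background jobs and wait for both, which is what the equivalent shell would look like (the input file and prefix are placeholders):

    # both filters read the same input independently, so they can run side by side
    samtools view -b -f 68  input.bam > "${prefix}_unmapped_one.bam" &
    samtools view -b -f 132 input.bam > "${prefix}_unmapped_two.bam" &
    wait   # block until both background jobs have finished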

一码平川MACHEL

Unable to remove env variables set in .bash_profile

I have some questions about .bash_profile and how PyCharm uses it. I'm on Mac OS X. I created a new PyCharm project in a new environment with virtualenv, using /usr/local/bin/python3.5 as the base interpreter.

Step 1: From the Mac OS terminal I edited my .bash_profile and exported two variables, DB_USER and DB_PASS, set to my_db_user and my_db_pass respectively.

Step 2: In PyCharm I imported os and printed the two variables with os.environ.get(). Running the .py file in PyCharm (F10) returned my_db_user and my_db_pass.

When I decided to create two new values, test user and test pass, in the virtual environment, I started by activating my venv (venv/bin/activate in PyCharm's shell). I then removed the changes I had made in Step 1.

However, running the .py from PyCharm (F10) still returns my_db_user and my_db_pass rather than test user and test pass (I have already deleted my_db_user and my_db_pass, so I have no idea where they are still coming from!). On top of that, when I run the Python file from the shell with python test.py, it returns (None, None) instead of the test user and test pass I want.

I need help fixing this so that os.environ.get() returns the output I expect. One possible cause is that I'm confused about how PyCharm, the shell inside PyCharm, and the terminal interact.

import os
user = os.environ.get('DB_USER')
password = os.environ.get('DB_PASS')
print(user, password)
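
A hedged note on where stale values usually come from (an assumption, not a diagnosis of this exact setup): edits to ~/.bash_profile only affect login shells started afterwards, and a GUI application such as PyCharm keeps the environment it was launched with until it is restarted. A minimal sketch for checking each layer:

    # in a brand-new terminal window, .bash_profile is re-read:
    echo "$DB_USER" "$DB_PASS"

    # in an already-open shell, re-source the file and explicitly drop the old values:
    source ~/.bash_profile
    unset DB_USER DB_PASS    # affects the current shell only

Restarting PyCharm (or, if needed, setting the variables in the run configuration's environment settings) is then what makes the new values visible to the IDE.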

舟马劳顿

Error running odps sql shell

FAILED: ODPS-0130071:[0,0] Semantic analysis exception - physical plan generation failed: java.lang.RuntimeException: java.lang.AssertionError: Internal error: Error while applying rule Enforcer, args [rel#77073715:AbstractConverter.ODPS.[0, 1].hash[0, 1],JoinHasher(input=rel#77073627:Subset#2743.ODPS.[].single,convention=ODPS,sort=[0, 1],dist=hash[0, 1],JoinHasher), rel#77109892:OdpsPhysicalProject.ODPS.[].single(input=rel#77073607:Subset#2733.ODPS.[].single,i_id=$0,_col3=CAST(2.4E1):DOUBLE)]

seanlook

How can I list all the configuration options my HBase supports?

HBase 0.9x, 1.1.x, 1.2.x and so on: later versions keep adding new configuration options, and right now the only place to look them up is the official book. For example, https://issues.apache.org/jira/browse/HBASE-18226 adds an option, hbase.regionserver.hostname.disable.master.reversedns, that cannot be found in the official documentation at all. I'm using a CDH release; how do I know whether CDH's 1.2 has merged that 1.4 feature and whether the option can be used? Does HBase itself have a way to query all options? In hbase shell I only see update_all_config and update_config, nothing for querying, and hbase-site.xml only contains some of the common options, not all of them.
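
A hedged suggestion, assuming your build exposes the standard Hadoop /conf servlet on the daemon web UI: a running HMaster or RegionServer will dump its effective configuration (shipped defaults plus site overrides) over HTTP, which is often enough to tell whether a given key is known to that build:

    # default HMaster info port is 16010; adjust host and port for your CDH deployment
    curl -s http://hbase-master:16010/conf | grep -A 2 hbase.regionserver.hostname.disable.master.reversedns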

k8s小能手

Ansible: Failed to reload sysctl: sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory

I'm setting up a Kubernetes cluster with Ansible. When I try to enable kernel IP routing I get the following error:

Failed to reload sysctl: sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory

Is this an Ansible bug, or is something wrong with my playbook?

# file: site.yml
# description: Asentaa ja kaynnistaa kubernetes-klusterin riippuvuuksineen
# resources:
#   - https://kubernetes.io/docs/setup/independent/install-kubeadm/
#   - http://michele.sciabarra.com/2018/02/12/devops/Kubernetes-with-KubeAdm-Ansible-Vagrant/
#   - https://docs.ansible.com/ansible/latest/modules/
#   - https://github.com/geerlingguy/ansible-role-kubernetes/blob/master/tasks/setup-RedHat.yml
#   - https://docs.docker.com/install/linux/docker-ce/centos/
# author: Tuomas Toivonen
# date: 30.12.2018

- name: Asenna docker ja kubernetes
  hosts: k8s-machines
  become: true
  become_method: sudo
  roles:
    - common
  vars:
    ip_modules:
      - ip_vs
      - ip_vs_rr
      - ip_vs_wrr
      - ip_vs_sh
      - nf_conntrack_ipv4
  tasks:
    - name: Poista swapfile
      tags:
        - os-settings
      mount:
        name: swap
        fstype: swap
        state: absent
    - name: Disabloi swap-muisti
      tags:
        - os-settings
      command: swapoff -a
      when: ansible_swaptotal_mb > 0
    - name: Konfiguroi verkkoasetukset
      tags:
        - os-settings
      command: modprobe {{ item }}
      loop: "{{ ip_modules }}"
    - name: Modprobe
      tags:
        - os-settings
      lineinfile:
        path: "/etc/modules"
        line: "{{ item }}"
        create: yes
        state: present
      loop: "{{ ip_modules }}"
    - name: Iptables
      tags:
        - os-settings
      sysctl:
        name: "{{ item }}"
        value: 1
        sysctl_set: yes
        state: present
        reload: yes
      loop:
        - 'net.bridge.bridge-nf-call-iptables'
        - 'net.bridge.bridge-nf-call-ip6tables'
    - name: Salli IP-reititys
      sysctl:
        name: net.ipv4.ip_forward
        value: 1
        state: present
        reload: yes
        sysctl_set: yes
    - name: Lisaa docker-ce -repositorio
      tags:
        - repos
      yum_repository:
        name: docker-ce
        description: docker-ce
        baseurl: https://download.docker.com/linux/centos/7/x86_64/stable/
        enabled: true
        gpgcheck: true
        repo_gpgcheck: true
        gpgkey:
          - https://download.docker.com/linux/centos/gpg
        state: present
    - name: Lisaa kubernetes -repositorio
      tags:
        - repos
      yum_repository:
        name: kubernetes
        description: kubernetes
        baseurl: https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
        enabled: true
        gpgcheck: true
        repo_gpgcheck: true
        gpgkey:
          - https://packages.cloud.google.com/yum/doc/yum-key.gpg
          - https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
        state: present
    - name: Asenna docker-ce -paketti
      tags:
        - packages
      yum:
        name: docker-ce
        state: present
    - name: Asenna NTP -paketti
      tags:
        - packages
      yum:
        name: ntp
        state: present
    - name: Asenna kubernetes -paketit
      tags:
        - packages
      yum:
        name: "{{ item }}"
        state: present
      loop:
        - kubelet
        - kubeadm
        - kubectl
    - name: Kaynnista palvelut
      tags:
        - services
      service: name={{ item }} state=started enabled=yes
      loop:
        - docker
        - ntpd
        - kubelet

- name: Alusta kubernetes masterit
  become: true
  become_method: sudo
  hosts: k8s-masters
  tags:
    - cluster
  tasks:
    - name: kubeadm reset
      shell: "kubeadm reset -f"
    - name: kubeadm init
      shell: "kubeadm init --token-ttl=0 --apiserver-advertise-address=10.0.0.101 --pod-network-cidr=20.0.0.0/8" # TODO
      register: kubeadm_out
    - set_fact:
        kubeadm_join: "{{ kubeadm_out.stdout_lines[-1] }}"
      when: kubeadm_out.stdout.find("kubeadm join") != -1
    - debug:
        var: kubeadm_join
    - name: Aseta ymparistomuuttujat
      shell: >
        cp /etc/kubernetes/admin.conf /home/vagrant/ &&
        chown vagrant:vagrant /home/vagrant/admin.conf &&
        export KUBECONFIG=/home/vagrant/admin.conf &&
        echo export KUBECONFIG=$KUBECONFIG >> /home/vagrant/.bashrc

- name: Konfiguroi CNI-verkko
  become: true
  become_method: sudo
  hosts: k8s-masters
  tags:
    - cluster-network
  tasks:
    - sysctl: name=net.bridge.bridge-nf-call-iptables value=1 state=present reload=yes sysctl_set=yes
    - sysctl: name=net.bridge.bridge-nf-call-ip6tables value=1 state=present reload=yes sysctl_set=yes
    - name: Asenna Flannel-plugin
      shell: >
        export KUBECONFIG=/home/vagrant/admin.conf ;
        kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    - shell: sleep 10

- name: Alusta kubernetes workerit
  become: true
  become_method: sudo
  hosts: k8s-workers
  tags:
    - cluster
  tasks:
    - name: kubeadm reset
      shell: "kubeadm reset -f"
    - name: kubeadm join
      tags:
        - cluster
      shell: "{{ hostvars['k8s-n1'].kubeadm_join }}" # TODO

Here is the full Ansible log:

ansible-controller: Running ansible-playbook...
cd /vagrant && PYTHONUNBUFFERED=1 ANSIBLE_NOCOLOR=true ANSIBLE_CONFIG='ansible/ansible.cfg' ansible-playbook --limit="all" --inventory-file=ansible/hosts -v ansible/site.yml
Using /vagrant/ansible/ansible.cfg as config file
/vagrant/ansible/hosts did not meet host_list requirements, check plugin documentation if this is unexpected
/vagrant/ansible/hosts did not meet script requirements, check plugin documentation if this is unexpected

PLAY [Asenna docker ja kubernetes] *

TASK [Gathering Facts] *
ok: [k8s-n1]
ok: [k8s-n3]
ok: [k8s-n2]

TASK [common : Testaa] *
changed: [k8s-n3] => {"changed": true, "checksum": "6920e1826e439962050ec0ab4221719b3a045f04", "dest": "/template.test", "gid": 0, "group": "root", "md5sum": "a4f61c365318c3e23d466914fbd02687", "mode": "0644", "owner": "root", "secontext": "system_u:object_r:etc_runtime_t:s0", "size": 14, "src": "/home/vagrant/.ansible/tmp/ansible-tmp-1546760756.54-124542112178019/source", "state": "file", "uid": 0}
changed: [k8s-n2] => {"changed": true, "checksum": "6920e1826e439962050ec0ab4221719b3a045f04", "dest": "/template.test", "gid": 0, "group": "root", "md5sum": "a4f61c365318c3e23d466914fbd02687", "mode": "0644", "owner": "root", "secontext": "system_u:object_r:etc_runtime_t:s0", "size": 14, "src": "/home/vagrant/.ansible/tmp/ansible-tmp-1546760756.51-240329169302936/source", "state": "file", "uid": 0}
changed: [k8s-n1] => {"changed": true, "checksum":
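
A hedged note, offered as an assumption rather than something confirmed in the thread: the net.bridge.* sysctl keys only appear after the br_netfilter kernel module has been loaded, so loading that module before the sysctl tasks usually makes /proc/sys/net/bridge/ show up. A minimal check on a target host:

    # the bridge sysctls exist only once br_netfilter is loaded
    modprobe br_netfilter
    ls /proc/sys/net/bridge/bridge-nf-call-iptables
    sysctl -w net.bridge.bridge-nf-call-iptables=1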

社区小助手

Converting a csv file to a parquet file with pyspark: Py4JJavaError: An error occurred while calling o347.parquet [duplicate]

I'm trying to convert a csv to Parquet. I'm using Python 3.6 and Spark 2.3.1 64-bit, and the Python itself is also 64-bit. I can't find a solution for the traceback below.

I have this csv:

Corp,Vathanya Beck
Corp,Mario Bazile
Open,Hasom Bennitt-traflet
Open,Jonathon Berry
Corp,Ayinde Amezquita
Corp,Carol Airiofolo
Corp,Wilfredo Brozo

I can convert the csv to parquet with the pandas to_parquet function (using the pyarrow engine), but somehow Spark does not work. I use the following Spark code to convert the csv to Parquet:

from pyspark import SparkContext, SparkConf
conf = SparkConf()
sc = SparkContext(conf=conf)
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
import pyspark
from pyspark.sql.session import SparkSession
from pyspark.sql.types import StructType, StructField, StringType
SparkSession.builder.config(conf=conf).appName("BLEH").getOrCreate()
schema = StructType([StructField('type', StringType(), True), StructField('name1', StringType(), True)])
df = sqlContext.read.csv('cv_transactions.csv', schema)
df.show()

This is the output after reading the csv into a Spark dataframe:

type  name1
Corp  Vathanya Beck
Corp  Mario Bazile
Open  Hasom Bennitt-tra...
Open  Jonathon Berry
Corp  Ayinde Amezquita
Corp  Carol Airiofolo
Corp  Wilfredo Brozo

But when I try to convert to parquet with the following code:

df.write.parquet('r.parquet')

it gives me the following error:

Py4JJavaError: An error occurred while calling o347.parquet.: org.apache.spark.SparkException: Job aborted. at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:224) at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:154) at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104) at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102) at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80) at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80) at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:654) at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:654) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:654) at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:273) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:267) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:225) at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:547) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:282) at 
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 25.0 failed 1 times, most recent failure: Lost task 0.0 in stage 25.0 (TID 25, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows. at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:285) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:197) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:196) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:109) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: (null) entry in command string: null chmod 0644 C:Usersrohan_pawarDocumentsparquetr_temporary0_temporaryattempt_20181214173824_0025_m_000000_0part-00000-e076c220-6226-4617-abf9-14e7f3a2ce81-c000.snappy.parquet at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:770) at org.apache.hadoop.util.Shell.execCommand(Shell.java:866) at org.apache.hadoop.util.Shell.execCommand(Shell.java:849) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733) at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225) at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209) at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307) at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296) at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328) at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789) at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:241) at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:342) at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:302) at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:37) at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:151) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:367) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:378) at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:269) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:267) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1414) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) ... 8 more Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1602) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1590) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1589) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1589) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1823) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:194) ... 31 more Caused by: org.apache.spark.SparkException: Task failed while writing rows. at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:285) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:197) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:196) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:109) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ... 
1 more Caused by: java.io.IOException: (null) entry in command string: null chmod 0644 C:Usersrohan_pawarDocumentsparquetr_temporary0_temporaryattempt_20181214173824_0025_m_000000_0part-00000-e076c220-6226-4617-abf9-14e7f3a2ce81-c000.snappy.parquet at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:770) at org.apache.hadoop.util.Shell.execCommand(Shell.java:866) at org.apache.hadoop.util.Shell.execCommand(Shell.java:849) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733) at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225) at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209) at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307) at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296) at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328) at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789) at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:241) at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:342) at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:302) at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:37) at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:151) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:367) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:378) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:269) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:267) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1414) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) ... 8 more

祝小祝

Using ansible to modify the acl permissions of /etc/shadow fails

When I use ansible to modify other configuration files it works, but modifying /etc/shadow fails with: setfacl: /etc/shadow: Operation not permitted, non-zero return code. How can I use ansible to change the acl permissions on /etc/shadow?

With the shell module:
ansible test1 -m shell -a 'setfacl -m user:aiuap:r /etc/shadow'
it fails with:
10.124.210.222 | FAILED | rc=1 >> setfacl: /etc/shadow: Operation not permitted non-zero return code

With the acl module:
ansible test1 -m acl -a 'path=/etc/shadow entity=test etype=user permissions=r state=present'
it fails with:
10.124.210.222 | FAILED! => { "changed": false, "cmd": "/usr/bin/setfacl -m user:test:r /etc/shadow", "msg": "setfacl: /etc/shadow: Operation not permitted", "rc": 1, "stderr": "setfacl: /etc/shadow: Operation not permitted\n", "stderr_lines": [ "setfacl: /etc/shadow: Operation not permitted" ], "stdout": "", "stdout_lines": [] }

With the script module:
ansible test1 -m script -a './acl.sh'
it reports success, but the acl permissions of /etc/shadow are not actually changed:
10.124.210.222 | SUCCESS => { "changed": true, "rc": 0, "stderr": "Shared connection to 10.124.210.222 closed.\r\n", "stderr_lines": [ "Shared connection to 10.124.210.222 closed." ], "stdout": "setfacl: /etc/shadow: Operation not permitted\r\ngetfacl: Removing leading '/' from absolute path namesrn# file: etc/shadowrn# owner: rootrn# group: rootrnuser::---rngroup::---rnother::---rnrn", "stdout_lines": [ "setfacl: /etc/shadow: Operation not permitted", "getfacl: Removing leading '/' from absolute path names", "# file: etc/shadow", "# owner: root", "# group: root", "user::---", "group::---", "other::---", "" ] }
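
A hedged first check (an assumption, since the ad-hoc commands above don't show any privilege escalation): changing the ACL of /etc/shadow requires root on the target host, so the module needs to run with become enabled, and the result can then be verified with getfacl:

    # -b / --become runs the module as root on the target
    ansible test1 -b -m acl -a 'path=/etc/shadow entity=test etype=user permissions=r state=present'
    ansible test1 -b -m shell -a 'getfacl /etc/shadow'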

焕兮

dataX fails when run from a crontab schedule

When datax is invoked from a crontab schedule it only prints

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.

and then nothing further happens. Environment: Ubuntu 18.04, Python 2.7. The same command runs without problems when executed in a shell.
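
A hedged sketch of the usual way to narrow this down (an assumption: cron runs jobs with a minimal, non-login environment, which is often what differs from an interactive shell): run the job under a login shell with the working directory set explicitly, and capture stdout/stderr so the silent exit becomes visible. The paths below are placeholders:

    # crontab entry: run via a login bash, cd to the datax install, log everything
    0 2 * * * /bin/bash -lc 'cd /opt/datax && python bin/datax.py job/job.json' >> /tmp/datax_cron.log 2>&1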

社区小助手

Spark-shell Git Bash

I can't get the spark-shell command to work from Git Bash. I'm fairly sure my environment is set up correctly, because spark-shell works fine from any directory when run from the command prompt. But when I run spark-shell in bash, instead of launching the actual shell it just prints this:

"C:\Program Files\Java\jdk1.8.0_191\bin\java" -cp "C:\spark\spark-2.4.0-bin-hadoop2.7/conf;C:\spark\spark-2.4.0-bin-hadoop2.7\jars\*" "-Dscala.usejavacp=true" -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name "Spark shell" spark-shell

社区小助手

Exporting a Dataset&lt;Row&gt; to CSV

I'm trying to produce a CSV file from some Spark SQL results. I've tried to fill in all the null values, but to no avail; it seems related to the way I've configured it. This is the code I'm running:

SparkSession spark = SparkSession.builder().appName("Workshop").master("local[*]").getOrCreate();
SQLContext sqlContext = new SQLContext(spark);
Dataset customers = spark.read().option("header", "true").csv(pathToCustomers);
Dataset unsubscribed = spark.read().option("header", "true").csv(pathToUnsubscribed);
Dataset cleaned = spark.read().option("header", "true").csv(pathToCleaned);
sqlContext.registerDataFrameAsTable(customers, "customers");
sqlContext.registerDataFrameAsTable(unsubscribed, "unsubscribed");
sqlContext.registerDataFrameAsTable(cleaned, "cleaned");
//Run the query then the split
Dataset deleteUnsubscribed = sqlContext.sql("select * from customers where Email not in (select Email_Address from unsubscribed)");
sqlContext.registerDataFrameAsTable(deleteUnsubscribed, "deleteUnsubscribed");
Dataset deleteCleaned = sqlContext.sql("select * from deleteUnsubscribed where Email not in (select Email_Address from cleaned)");
deleteCleaned.write().option("sep", ";").option("header", "true").csv("Data/customers.csv");

This produces the following error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted.at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:147)at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:121)at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:121)at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:121)at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:101)at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:492)at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:215)at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:198)at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:579)at com.example.demo.DemoApplication.deleteCleanedAndUnsubscribedFromCustomers(DemoApplication.java:114)at com.example.demo.DemoApplication.main(DemoApplication.java:124) Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 1 times, most recent failure: Lost task 0.0 in stage 8.0 (TID 8, localhost, executor driver): java.lang.NullPointerExceptionat java.lang.ProcessBuilder.start(Unknown 
Source)at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)at org.apache.hadoop.util.Shell.run(Shell.java:379)at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468)at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:132)at org.apache.spark.sql.execution.datasources.csv.CsvOutputWriter.(CSVRelation.scala:208)at org.apache.spark.sql.execution.datasources.csv.CSVOutputWriterFactory.newInstance(CSVRelation.scala:178)at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.(FileFormatWriter.scala:234)at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:182)at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$3.apply(FileFormatWriter.scala:129)at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$3.apply(FileFormatWriter.scala:128)at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)at org.apache.spark.scheduler.Task.run(Task.scala:99)at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)at java.lang.Thread.run(Unknown Source) Driver stacktrace:at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)at scala.Option.foreach(Option.scala:257)at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802)at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1650)at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)at org.apache.spark.SparkContext.runJob(SparkContext.scala:1918)at org.apache.spark.SparkContext.runJob(SparkContext.scala:1931)at org.apache.spark.SparkContext.runJob(SparkContext.scala:1951)at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:127)... 22 more Caused by: java.lang.NullPointerExceptionat java.lang.ProcessBuilder.start(Unknown Source)at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)at org.apache.hadoop.util.Shell.run(Shell.java:379)at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:468)at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:132)at org.apache.spark.sql.execution.datasources.csv.CsvOutputWriter.(CSVRelation.scala:208)at org.apache.spark.sql.execution.datasources.csv.CSVOutputWriterFactory.newInstance(CSVRelation.scala:178)at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.(FileFormatWriter.scala:234)at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:182)at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$3.apply(FileFormatWriter.scala:129)at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$3.apply(FileFormatWriter.scala:128)at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)at org.apache.spark.scheduler.Task.run(Task.scala:99)at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)at java.lang.Thread.run(Unknown Source0)

I tried with a smaller dataset (this one has 57,548 rows), but the error is the same. Here is the pom file:

<?xml version="1.0" encoding="UTF-8"?>
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
4.0.0 com.exampledemo0.0.1-SNAPSHOTjar demoDemo project for Spring Boot
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.0.RELEASE</version>
<relativePath/> <!-- lookup parent from repository -->
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<java.version>1.8</java.version>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-test</artifactId>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.1.0</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.1.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.codehaus.janino</groupId>
      <artifactId>janino</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.codehaus.janino</groupId>
      <artifactId>commons-compiler</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>org.codehaus.janino</groupId>
  <artifactId>commons-compiler</artifactId>
  <version>3.0.6</version>
</dependency>
<dependency>
  <groupId>org.codehaus.janino</groupId>
  <artifactId>janino</artifactId>
</dependency>
<plugins>
  <plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
  </plugin>
</plugins>