Debugging a filter operation in Scala through `in-code variable inspection` [duplicate]

def main(args:Array[String]){

Logger.getLogger("org").setLevel(Level.ERROR)
val sc = new SparkContext("local[*]","WordCountRe")
val input = sc.textFile("data/book.txt")
//With regexp
val words = input.flatMap(x=>x.split("\\W+"))
//Lower case
val lowerCaseWords = words.map(x => x.toLowerCase())
val wordCounts = lowerCaseWords.map(x => (x,1)).reduceByKey((x,y)=>x+y)
val sortedWordCounts = wordCounts.sortBy(-_._2)
val commonEnglishStopWords = List("you","to","your","the","a","of","and","that","it","in","is","for","on","are","if","s","i","with","t","this","or","but","they","will","what","at","my","re","do","not","about","more","an","up","need","them","from","how","there","out","new","work","so","just","don","","get","their","by","some","ll","self","make","may","even","when","one","than","also","much","job","who","was","these","find","into","only")
val filteredWordCounts = sortedWordCounts.filter{
  x =>
    val inspectVariable = commonEnglishStopWords.contains(x._1)} //Error here
filteredWordCounts.collect().foreach(println)   } }

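When I try this code, I get a compile error:

type mismatch; found : Unit required: Boolean WordCountRe.scala /SparkScalaCourse/src/com/sundogsoftware/spark line 29 Scala Problem

I found what was wrong with my code (the ._1, which picks the word out of the (word, count) tuple, needs to go inside the contains call), but I still don't know how to debug/inspect the value in a case like this.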

社区小助手 2018-12-21 13:41:13
1 answer
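  • 社区小助手 is an administrator of the Spark China community. I regularly post livestream recaps, other materials, and articles, and I also compile the Spark questions and answers raised in the DingTalk group.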

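    The problem is that you assign the Boolean result of the contains method to val inspectVariable. A block whose last statement is a val definition evaluates to Unit, but the filter method requires a Boolean.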
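
    The typing rule behind this error can be checked in isolation, without Spark. The following is a minimal sketch; the object name BlockValueDemo and the two-word stopWords list are made up for illustration and are not part of the question's code.

    ```scala
    object BlockValueDemo {
      // Hypothetical, shortened stand-in for commonEnglishStopWords.
      val stopWords: List[String] = List("the", "a")

      // A block whose last statement is a definition has type Unit.
      val endsWithVal: Unit = { val found = stopWords.contains("the") }

      // A block whose last statement is an expression takes that expression's type.
      val endsWithExpr: Boolean = { val found = stopWords.contains("the"); found }

      def main(args: Array[String]): Unit = {
        println(endsWithVal)  // prints ()
        println(endsWithExpr) // prints true
      }
    }
    ```

    The function literal passed to filter in the question has the same shape as endsWithVal, which is why the compiler reports Unit where Boolean is required.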

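    Just remove val inspectVariable = and that should fix it.

    Alternatively, keep the assignment and return the value by adding a new line after it that contains only inspectVariable.

    As shown: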

    val filteredWordCounts = sortedWordCounts.filter { x =>
      val inspectVariable = commonEnglishStopWords.contains(x._1) // put your breakpoint here
      inspectVariable
    }
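
    If no debugger is attached, one way to inspect the value is to print it inside the predicate before returning it. The sketch below uses a plain Scala Seq instead of an RDD so that it runs without Spark; FilterInspection, keepNonStopWords, and the shortened stopWords set are hypothetical names chosen for illustration. It also assumes the goal is to drop stop words, so it returns the negated result; returning the contains result directly, as in the snippets above, keeps only the stop words, because filter retains the elements for which the predicate is true.

    ```scala
    object FilterInspection {
      // Hypothetical, shortened stand-in for commonEnglishStopWords.
      val stopWords: Set[String] = Set("the", "a", "of")

      // Keeps the pairs whose word is NOT a stop word and prints each decision.
      def keepNonStopWords(wordCounts: Seq[(String, Int)]): Seq[(String, Int)] =
        wordCounts.filter { case (word, count) =>
          val isStopWord = stopWords.contains(word) // a breakpoint can go on this line
          println(s"word=$word count=$count isStopWord=$isStopWord")
          !isStopWord // the last expression is the block's value, so it must be a Boolean
        }

      def main(args: Array[String]): Unit = {
        val sample = Seq(("the", 10), ("spark", 4), ("of", 7), ("scala", 3))
        keepNonStopWords(sample).foreach(println)
      }
    }
    ```

    The same predicate body can be passed to filter on an RDD. With the local[*] master used in the question the println output appears in the same console; on a cluster it is written to the executor logs instead.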

    2019-07-17 23:23:23