
Notes on Scala Pattern Matching and the Type System

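Scala is a common first choice for big-data processing. Before Java 8, Java had no lambdas or Stream API, so it offered little support for functional-style data processing.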

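I. Pattern matching
Pattern matching resembles Java's switch/case. A Java switch matches only on the value passed in, whereas a Scala match can also match on types, collections, and more, which makes it considerably more powerful.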

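//A simple example that matches on a value:
//The result type is Unit: the method uses procedure syntax (no = before the body), and every branch is a println(), which also returns Unit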
scala> def bigData(data:String){
     | data match{
     | case "Spark" => println("wow")
     | case "Hadoop"=> println("ok")
     | case _ => println("Something Others ")  //_ matches every value not handled by the cases above
     | }
     | }
bigData: (data: String)Unit

scala> bigData("Hadoop")
ok

scala> bigData("HBase")
Something Others

//A guard (an if condition) can be attached to a case, here to the wildcard _
scala> def bigData(data:String){
    data match{
    case "Spark" => println("wow")
    case "Hadoop"=> println("ok")
    case _ if data == "HBase" => println("Cool:"+data)
    case _ => println("Something Others ")
    }
}
bigData: (data: String)Unit

scala> bigData("HBase")
Cool:HBase

//Matching on type
scala> import java.io._
import java.io._
scala> def exception(e: Exception){
    e match{
        case fileException:FileNotFoundException => println("File not found:" + fileException)
        case _: Exception => println("Other exception")
        }
    }
exception: (e: Exception)Unit
scala> exception(new FileNotFoundException("Oops"))
File not found:java.io.FileNotFoundException: Oops

//Pattern matching on collections (Array is used here; List and other sequences work the same way)
scala> def data(array:Array[String]){
     | array match{
     | case Array("Scala") => println("Scala")  //the array holds exactly one element, "Scala"
     | case Array(spark, hadoop, hbase) => println("spark:" + spark +" hadoop:" + hadoop + " hbase:"+ hbase)  //the array holds exactly three elements, bound in order to spark, hadoop, hbase
     | case Array("Spark", _*) => println("Spark ...") //the array starts with "Spark"; _* matches any remaining elements without binding them
     | case _ => println("Unknown")
     | }
     | }
data: (array: Array[String])Unit

scala> data(Array("Scala"))
Scala

scala> data(Array("Spark"))
Spark ...

scala> data(Array("wow","ok","hhha"))
spark:wow hadoop:ok hbase:hhha

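//case class: constructor parameters are read-only members by default (val, getter only)
//A companion object with apply and unapply methods is generated automatically, so instances can be created without writing new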
scala> case class Person(name:String)
defined class Person
scala> Person("Spark")  //"Spark" is passed to the apply() method of the companion object generated for case class Person
res8: Person = Person(Spark) //an instance of case class Person is returned
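
To make the role of apply and unapply concrete, here is a minimal sketch that writes by hand roughly what a case class provides for construction and pattern matching. The Email class, the describe method and the ExtractorSketch wrapper are hypothetical names chosen for illustration; what the compiler really generates for a case class includes more than this (equals, hashCode, toString, copy).

```scala
object ExtractorSketch {
  // Hypothetical plain class (not a case class), so nothing is generated for us.
  class Email(val user: String, val domain: String)

  object Email {
    // apply lets callers write Email("a", "b") without new.
    def apply(user: String, domain: String): Email = new Email(user, domain)

    // unapply is the method a pattern such as `case Email(u, d)` calls.
    def unapply(e: Email): Option[(String, String)] = Some((e.user, e.domain))
  }

  def describe(x: Any): String = x match {
    case Email(user, domain) => user + " at " + domain
    case _                   => "not an email"
  }
}
```

Calling ExtractorSketch.describe(ExtractorSketch.Email("spark", "apache.org")) goes through apply to build the value and through unapply to take it apart again.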

//Define a plain class Person (in the REPL this replaces the case class Person defined above)
scala> class Person
defined class Person
warning: previously defined object Person is not a companion to class Person.
Companions must be defined together; you may wish to use :paste mode for this.

//Define two case classes
scala> case class Worker(name:String,salary:Double) extends Person
defined class Worker

scala> case class  Student(name:String, score:Double) extends Person
defined class Student

//Pattern matching on case classes
scala> def sayHi(person:Person){
     | person match{
     | case Student(name,score) => println("Student: " + name + " " + score)
     | case Worker(name,salary) =>println("Worker: " + name + " " + salary)
     | case _ => println("Unknown")
     | }
     | }
sayHi: (person: Person)Unit

scala> sayHi(Worker("Spark",6.5))
Worker: Spark 6.5

scala> sayHi(Student("Hadoop",6.0))
Student: Hadoop 6.0

//A case class with a variable number of arguments
scala> case class WorkClass(persons:Person*)  //WorkClass accepts any number of Person arguments
defined class WorkClass

scala> def sayHi(){
     |      val person = WorkClass(Worker("Spark",6.6),Student("Hadoop",6.5))
     |      person match{
     |      case WorkClass(_,Student(name, score) ) => println("Student: " + name + " " + score)
     |      case _ => println("Unknown")
     |      }
     |      }
sayHi: ()Unit

scala> sayHi()
Student: Hadoop 6.5

//Matching on tuples
scala> val t=("spark","hive","SparkSQL")
t: (String, String, String) = (spark,hive,SparkSQL)

scala>       def tuplePattern(t:Any)=t match {
     |         case (one,_,_) => one   //the first element, "spark", is bound to one
     |         case _ => "Other"
     |       }
tuplePattern: (t: Any)Any

scala> tuplePattern(t)
res3: Any = spark

//Using a tuple pattern in a for loop to print a Map in a fixed format
scala> def pipei(){
     |     val m=Map("china"->"beijing","japan"->"tokyo","america"->"washington dc")
     |     for((nation,capital)<-m)
     |       println(nation+": " +capital)
     |   }
pipei: ()Unit

scala> pipei()
china: beijing
japan: tokyo
america: washington dc

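When a match should cover every possible case, declare the superclass of the case classes as sealed; in the example above, Person would become sealed class Person. All direct subclasses must then be defined in the same source file, and the compiler warns when a match on Person leaves a case out.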
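
A minimal sketch of a sealed hierarchy, reusing the Worker and Student case classes from earlier. The SealedSketch wrapper and the describe method are names invented here, and the exact wording of the compiler's non-exhaustive-match warning is not reproduced because it varies by Scala version.

```scala
object SealedSketch {
  // sealed: every direct subclass must live in this same source file.
  sealed abstract class Person
  case class Worker(name: String, salary: Double) extends Person
  case class Student(name: String, score: Double) extends Person

  // Both subclasses are covered, so no wildcard case is needed.
  // Removing either case should make the compiler report the match as non-exhaustive.
  def describe(p: Person): String = p match {
    case Worker(name, salary) => "Worker: " + name + " " + salary
    case Student(name, score) => "Student: " + name + " " + score
  }
}
```

Leaving out the catch-all case is deliberate: with a wildcard present the compiler can no longer point out a subclass that was forgotten.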
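Pattern matching on Option:
The Spark source code often matches on case class Some and case object None, both of which extend Option.
A case class gets an automatically generated companion object with apply and unapply methods; a case object is a singleton and gets neither.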

//Using Some and None in a pattern match
scala> def OptionDemo(t:String){
     |      val p=Map("spark"->2,"hadoop"->3,"hbase"->4)
     |      p.get(t) match{
     |      case Some(x) => println(x)
     |      case None => println("None")
     |      }
     |      }
OptionDemo: (t: String)Unit

scala>  OptionDemo("Spark")
None

scala>  OptionDemo("spark")
2

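II. The type system
Generics: the type parameters of a generic class or generic method are fixed to concrete types at the point of use.
Classes and traits can take type parameters. Objects cannot.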
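
The rule can be illustrated with a small sketch. Container, Box, DefaultName and the GenericTraitSketch wrapper are hypothetical names used only for this example.

```scala
object GenericTraitSketch {
  // A trait may declare a type parameter...
  trait Container[T] {
    def value: T
  }

  // ...and so may a class that implements it.
  class Box[T](val value: T) extends Container[T]

  // An object cannot declare a type parameter of its own
  // (`object Empty[T]` does not compile), but it can extend
  // a generic type with a concrete type argument.
  object DefaultName extends Container[String] {
    val value: String = "Spark"
  }
}
```

An object is a single instance, so there is no later point at which a type argument could be supplied; that is why the type has to be fixed where the object is defined.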

//Define a generic class
scala> class Person[T](val content:T){  //generic class; Scala can infer T from the constructor argument
     | def getContent(id: T) = id + "_" + content //method that uses the class's type parameter T
     | }
defined class Person

scala> val p = new Person[String]("Spark")  //T is fixed to String
p: Person[String] = Person@8c11eee

scala> p.getContent("Scala")  //getContent now accepts only a String
res1: String = Scala_Spark

scala> p.getContent(666)
<console>:10: error: type mismatch;
 found   : Int(666)
 required: String
              p.getContent(666)

//Multiple type parameters
scala> class Person[T,S](val name:T, val age:S){
     | def get() = name + "_" + age
     | }
defined class Person

scala> val p = new  Person[String,Int]("Spark",8)
p: Person[String,Int] = Person@661fe025

scala> p.get
res6: String = Spark_8

scala> def mid[T](a:Array[T]) = a(a.length/2)  //generic method
mid: [T](a: Array[T])T

scala> mid(Array("hadoop","spark","hbase"))
res0: String = spark

scala> mid(Array(1,2,3))
res1: Int = 2

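Type Variable Bound:
If a generic class calls a method that not every type provides, the code does not compile without a bound, because the compiler knows nothing about the concrete type that will be supplied later. An upper bound (<:) restricts T to the inheritance hierarchy of a class or interface that declares the method, so the call type-checks.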

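Lower Bound (>:) and Upper Bound (<:)
A lower bound requires the type argument to be the given class or one of its supertypes.
An upper bound requires the type argument to be the given class or one of its subtypes.
With an upper bound, the generic code is written against the methods declared by the bound, typically an abstract class or interface, rather than against any particular subclass.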
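
A lower bound is easiest to see on a method that widens a container. The sketch below is only an illustration: the replaceFirst method, the Person and Student classes as defined here, and the LowerBoundSketch wrapper are assumptions made for the example.

```scala
object LowerBoundSketch {
  class Person(val name: String)
  class Student(name: String) extends Person(name)

  class Pair[T](val first: T, val second: T) {
    // R >: T: the new element may have type T or any supertype of T,
    // and the resulting Pair is typed with that wider type R.
    def replaceFirst[R >: T](newFirst: R): Pair[R] = new Pair[R](newFirst, second)
  }

  val students: Pair[Student] = new Pair(new Student("Hadoop"), new Student("HBase"))

  // Passing a Person makes the compiler choose R = Person.
  val mixed: Pair[Person] = students.replaceFirst(new Person("Spark"))
}
```

Without the lower bound, replaceFirst could only accept another Student; with it, the result is honestly typed as Pair[Person] instead of forcing a cast.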

//T <: AnyVal makes AnyVal the upper bound of T: any subtype of AnyVal is accepted, everything else is rejected
case class Student[S,T <: AnyVal](var name:S,var height:T)

scala> class Pair[T](val first: T, val second: T){  //error: T is not guaranteed to have a compareTo method
     | def smaller = if (first.compareTo(second)<0) first else second
     | }
<console>:8: error: value compareTo is not a member of type parameter T
       def smaller = if (first.compareTo(second)<0) first else second

scala> class Pair[T <: Comparable[T]](val first: T, val second: T){  //upper bound
     | def smaller = if (first.compareTo(second)<0) first else second
     | }
defined class Pair

scala> val p = new Pair("hadoop", "Spark")
p: Pair[String] = Pair@6de30571

scala> p.smaller
res2: String = Spark

scala> val h = new Pair(1, 2)  //error: Int does not satisfy the upper bound; a view bound solves this
<console>:8: error: inferred type arguments [Int] do not conform to class Pair's type parameter bounds [T <: Comparable[T]]
       val h = new Pair(1, 2)
               ^
<console>:8: error: type mismatch;
 found   : Int(1)
 required: T
       val h = new Pair(1, 2)
                        ^
<console>:8: error: type mismatch;
 found   : Int(2)
 required: T
       val h = new Pair(1, 2)

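View Bound <%
After a type variable bound is applied, a concrete type that does not implement the required interface is rejected by the compiler. A view bound crosses the class hierarchy: T <% B accepts any type T for which an implicit conversion to B is in scope, so the methods of B can be called on values of T.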

scala> class Pair[T <% Comparable[T]](val first: T, val second: T){  //视图界定
     | def smaller = if (first.compareTo(second)<0) first else second
     | }
defined class Pair

scala> val h = new Pair(1, 2)  //Int is implicitly converted to RichInt, which is a Comparable[Int]
h: Pair[Int] = Pair@50d951e7

scala> h.smaller
res3: Int = 1

//Context Bound T : M (M is another generic type) requires an implicit value of type M[T] to exist
scala> class Compare[T:Ordering](val n1:T, val n2:T){
        //implicit value ordered, of type Ordering[T]
     |  def bigger(implicit ordered: Ordering[T]) = if(ordered.compare(n1,n2)>0) n1 else n2}
defined class Compare

scala> new Compare[Int](8, 3).bigger
res3: Int = 8

scala> new Compare[String]("Spark", "Hadoop").bigger
res4: String = Spark
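
The Compare class above declares the context bound and also takes an implicit parameter on bigger. A possible variant, sketched here under the assumption that only the context bound is wanted, retrieves the Ordering[T] with the standard implicitly method instead. The ContextBoundSketch wrapper is a name invented for this example.

```scala
object ContextBoundSketch {
  // T: Ordering asks the compiler to find an implicit Ordering[T]
  // at the place where a Compare instance is created.
  class Compare[T: Ordering](val n1: T, val n2: T) {
    // implicitly fetches that Ordering[T]; bigger declares no parameter list of its own.
    def bigger: T = if (implicitly[Ordering[T]].compare(n1, n2) > 0) n1 else n2
  }
}
```

Usage is the same as before: new ContextBoundSketch.Compare(8, 3).bigger picks the larger Int, and a pair of Strings is compared lexicographically by the default Ordering[String].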

Multiple bounds:
A type variable can have both a lower and an upper bound: T >: Lower <: Upper
A type variable can have several view bounds: T <% Comparable[T] <% String
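
A minimal sketch of a type variable with both a lower and an upper bound. Animal, Dog, Puppy, Kennel and the MultipleBoundsSketch wrapper are hypothetical names used only for illustration.

```scala
object MultipleBoundsSketch {
  class Animal
  class Dog extends Animal
  class Puppy extends Dog

  // T must be a supertype of Puppy and a subtype of Animal,
  // for example Puppy, Dog or Animal.
  class Kennel[T >: Puppy <: Animal](val resident: T)

  val a = new Kennel[Dog](new Dog)
  val b = new Kennel[Animal](new Puppy)
  // new Kennel[String]("x") would be rejected: String satisfies neither bound.
}
```

The lower bound is written first and the upper bound second, matching the T >: Lower <: Upper form.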

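trait List[+T] {} covariant
trait List[-T] {} contravariant
If type S is a subtype of type A and List[S] is then also treated as a subtype of List[A], the type parameter is covariant.
If type S is a subtype of type A and Queue[A] is instead treated as a subtype of Queue[S], the type parameter is contravariant.
Java has no declaration-site variance; it offers use-site variance through wildcards (? extends T, ? super T). Unchecked variance would break type safety, so the Scala compiler checks where a variant type parameter is used.
In a covariant class the methods must respect the annotation as well: a covariant T cannot be used as a method parameter type, so such methods take their own type parameter with a lower bound (U >: T).
For example:
Pair[T] invariant
If Student is a subclass of Person, Pair[Student] and Pair[Person] are unrelated.
Pair[+T] covariant
If Student is a subclass of Person, Pair[Student] is a subtype of Pair[Person].
Pair[-T] contravariant
If Student is a subclass of Person, Pair[Student] is a supertype of Pair[Person].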
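
The two directions can be checked with assignments that compile only because of the variance annotations. This is a sketch under assumptions: Box, Printer, the Person and Student classes as defined here, and the VarianceSketch wrapper are all names invented for the example.

```scala
object VarianceSketch {
  class Person(val name: String)
  class Student(name: String) extends Person(name)

  // Covariant: T appears only as a result type.
  class Box[+T](val content: T)

  // Contravariant: T appears only as a parameter type.
  trait Printer[-T] {
    def show(value: T): String
  }

  // Box[Student] is accepted where Box[Person] is expected.
  val personBox: Box[Person] = new Box[Student](new Student("Hadoop"))

  val personPrinter: Printer[Person] = new Printer[Person] {
    def show(value: Person): String = "Person(" + value.name + ")"
  }

  // Printer[Person] is accepted where Printer[Student] is expected.
  val studentPrinter: Printer[Student] = personPrinter
}
```

Removing the + or - from either declaration should turn the corresponding assignment into a type mismatch, which is the invariant case.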
