Swift & C (Part 2: Unsafe Swift and Data)

Open nonocast opened this issue 2 years ago • 0 comments

Version: Swift 5.6, macOS 12.3, Xcode 11

Unsafe Swift - WWDC20

Benefits of unsafe interfaces:

  • Interoperability with code written in C or Objective-C
  • Control over runtime performance

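Key points:

  • Safe code does not mean code that never crashes. A safe API guarantees that execution stops by raising a fatal error, so a mistake cannot go on to cause further damage.
  • Safe operations have well-defined behavior on all input
  • Unsafe operations have undefined behavior on some input
  • For example, with x!, if x is nil the program terminates. That is deliberate, documented behavior.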
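
The distinction can be sketched with Optional, which offers both a safe and an unsafe unwrap. The variable names are illustrative, and the two risky lines are left commented out on purpose.

```swift
let present: Int? = 42
let missing: Int? = nil

// Safe: force-unwrapping has well-defined behavior on every input.
// It returns the value, or stops the program with a fatal error on nil.
let forced = present!

// Unsafe: unsafelyUnwrapped is only valid when a value is present.
// It is fine here because `present` is known to be non-nil.
let unchecked = present.unsafelyUnwrapped

// Safe alternative for the nil case: supply a default instead of unwrapping.
let fallback = missing ?? 0

// let crash = missing!                  // well-defined: traps with a fatal error
// let ub = missing.unsafelyUnwrapped    // undefined behavior in optimized builds

print(forced, unchecked, fallback)
```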

Unsafe Pointers

  • Equivalent to C pointers
let ptr = UnsafeMutablePointer<Int>.allocate(capacity: 1)
ptr.initialize(to: 42)
print(ptr.pointee)
ptr.deallocate()
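ptr.pointee = 23 // dangling pointer: writing after deallocate() is undefined behavior
  • Memory obtained through unsafe APIs is the developer's responsibility; Swift no longer manages it for you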

Type mapping (C type: Swift type):

  • const Pointee *: UnsafePointer<Pointee>
  • Pointee *: UnsafeMutablePointer<Pointee>
  • const void *, opaque pointers: UnsafeRawPointer
  • void *, opaque pointers: UnsafeMutableRawPointer
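
As a minimal sketch of how these four pointer types relate, the hypothetical helper below (readThroughPointerViews is not from the talk) allocates one CInt and reads it back through typed, read-only, and raw views of the same memory.

```swift
/// Reads one stored value through three different pointer views.
func readThroughPointerViews() -> (typed: CInt, readOnly: CInt, raw: CInt) {
  // int *
  let mutableTyped = UnsafeMutablePointer<CInt>.allocate(capacity: 1)
  mutableTyped.initialize(to: 7)
  defer {
    mutableTyped.deinitialize(count: 1)
    mutableTyped.deallocate()
  }

  // const int *: a read-only view of the same address
  let readOnlyTyped = UnsafePointer(mutableTyped)
  // void *: the type information is dropped
  let mutableRaw = UnsafeMutableRawPointer(mutableTyped)
  // const void *
  let readOnlyRaw = UnsafeRawPointer(mutableRaw)

  // A raw pointer has no pointee; the type must be named on every load.
  return (mutableTyped.pointee, readOnlyTyped.pointee, readOnlyRaw.load(as: CInt.self))
}

let views = readThroughPointerViews()
print(views)
```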

Example:

// C: void process_integers(const int *start, size_t count);

// Swift: func process_integers(_ start: UnsafePointer<CInt>!, _ count: Int)

// client

let start = UnsafeMutablePointer<CInt>.allocate(capacity: 4) // allocate a dynamic buffer

start.initialize(to: 0)
(start + 1).initialize(to: 2)
(start + 2).initialize(to: 4)
(start + 3).initialize(to: 6)

process_integers(start, 4)

// release
start.deinitialize(count: 4)
start.deallocate()

Buffer Pointer

Buffer Pointer = Pointer + capacity

  • UnsafeBufferPointer<Element>
  • UnsafeMutableBufferPointer<Element>
  • UnsafeRawBufferPointer
  • UnsafeMutableRawBufferPointer
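
Because a buffer pointer carries its own count, it behaves like a Collection. A small sketch, assuming a hypothetical helper named sumOfEvens that fills a manually allocated buffer and sums it:

```swift
/// Fills a manually allocated buffer with 0, 2, 4, ... and returns the sum.
func sumOfEvens(count: Int) -> Int {
  let buffer = UnsafeMutableBufferPointer<Int>.allocate(capacity: count)
  buffer.initialize(repeating: 0)
  defer {
    // The buffer was allocated by hand, so it is released by hand.
    buffer.baseAddress?.deinitialize(count: count)
    buffer.deallocate()
  }

  // Indexing works as on an array; bounds are only checked in debug builds.
  for index in buffer.indices {
    buffer[index] = index * 2
  }

  // A read-only view over the same memory, usable with Collection algorithms.
  let readOnly = UnsafeBufferPointer(buffer)
  return readOnly.reduce(0, +)
}

let total = sumOfEvens(count: 4) // 0 + 2 + 4 + 6
print(total)
```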

Temporary pointers to Swift values

Generated pointer is only valid for the duration of the closure’s execution

  • withUnsafePointer(to:_:)
  • withUnsafeMutablePointer(to:_:)
  • withUnsafeBytes(of:_:)
  • withUnsafeMutableBytes(of:_:)
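
A short sketch of three of these functions on a single value. The helper name temporaryPointerDemo is made up for illustration, and littleEndian is used only so that the byte order is the same on every platform.

```swift
func temporaryPointerDemo() -> (copy: Int32, updated: Int32, bytes: [UInt8]) {
  var value: Int32 = 0x01020304

  // Read-only pointer, valid only inside the closure.
  let copy = withUnsafePointer(to: value) { $0.pointee }

  // Mutable pointer: writes go straight to `value`.
  withUnsafeMutablePointer(to: &value) { $0.pointee += 1 }

  // Raw bytes of the value, copied out before the closure returns.
  let bytes = withUnsafeBytes(of: value.littleEndian) { Array($0) }

  return (copy, value, bytes)
}

let demo = temporaryPointerDemo()
print(demo)
```

The pointer itself must never be returned from the closure; only values copied out of it, as above, remain valid afterwards.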

// C: void process_integers(const int *start, size_t count);

// Swift:

let values: [CInt] = [0, 2, 4, 6]

values.withUnsafeBufferPointer { buffer in
  process_integers(buffer.baseAddress!, buffer.count)
}

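Because buffer pointers have to be passed to C so often, Swift provides special syntax for it: implicit conversions 👍 Each row below reads C parameter type: imported Swift type: Swift argument that can be passed directly.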

  • const T *: UnsafePointer<T>: [T]
  • T *: UnsafeMutablePointer<T>: inout [T]
  • const char *: UnsafePointer<CChar>: String
  • T *: UnsafeMutablePointer<T>: inout T
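
The four conversions can be tried without a C library, since they apply to any function that takes pointer parameters. In this sketch sumIntegers, doubleInPlace, and increment are hypothetical Swift stand-ins for imported C functions; strlen is the real C function.

```swift
import Foundation

// Stand-in for: int sum_integers(const int *start, size_t count);
func sumIntegers(_ start: UnsafePointer<CInt>?, _ count: Int) -> CInt {
  guard let start = start else { return 0 }
  return UnsafeBufferPointer(start: start, count: count).reduce(0, +)
}

// Stand-in for: void double_in_place(int *start, size_t count);
func doubleInPlace(_ start: UnsafeMutablePointer<CInt>?, _ count: Int) {
  guard let start = start else { return }
  for index in 0..<count { start[index] *= 2 }
}

// Stand-in for: void increment(int *value);
func increment(_ value: UnsafeMutablePointer<CInt>) {
  value.pointee += 1
}

var values: [CInt] = [0, 2, 4, 6]
let count = values.count

let sum = sumIntegers(values, count)   // [T] converts to UnsafePointer<T>
doubleInPlace(&values, count)          // inout [T] converts to UnsafeMutablePointer<T>

var single: CInt = 41
increment(&single)                     // inout T converts to UnsafeMutablePointer<T>

let length = strlen("hello")           // String converts to UnsafePointer<CChar>

print(sum, values, single, length)
```

In every case the pointer is only valid for the duration of the call, which is the same rule as for the closure-based functions.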

// C:

int sysctl(int *name, u_int namelen,
  void *oldp, size_t *oldlenp,
  void *newp, size_t newlen)

// Swift:

func sysctl(
  _ name: UnsafeMutablePointer<CInt>!,
  _ namelen: CUnsignedInt,
  _ oldp: UnsafeMutableRawPointer!,
  _ oldlenp: UnsafeMutablePointer<Int>!,
  _ newp: UnsafeMutableRawPointer!,
  _ newlen: Int
) -> CInt

Here is where the implicit conversions pay off:

import Darwin

func cachelineSize() -> Int {
  var query = [CTL_HW, HW_CACHELINE]
  var result: CInt = 0
  var resultSize = MemoryLayout<CInt>.size
  let r = sysctl(&query, CUnsignedInt(query.count), &result, &resultSize, nil, 0)
  precondition(r == 0, "Cannot query cache line size")
  // sysctl writes back the number of bytes it stored; it must match a CInt
  precondition(resultSize == MemoryLayout<CInt>.size)
  return Int(result)
}

print(cachelineSize()) // e.g. 64 (the value depends on the CPU)

Now a String example:

func kernelVersion() -> String {
  var query = [CTL_KERN, KERN_VERSION]
  var length = 0
  // First call: a nil buffer asks sysctl for the required length
  let r = sysctl(&query, 2, nil, &length, nil, 0)
  precondition(r == 0, "Error retrieving kern.version")
  return String(unsafeUninitializedCapacity: length) { buffer in
    var length = buffer.count
    // Second call: fill the string's own storage directly
    let r = sysctl(&query, 2, buffer.baseAddress, &length, nil, 0)
    precondition(r == 0, "Error retrieving kern.version")
    precondition(length > 0 && length <= buffer.count)
    precondition(buffer[length - 1] == 0)
    return length - 1 // drop the trailing \0
  }
}

Unsafe Swift | raywenderlich

MemoryLayout

The memory layout of a type, describing its size, stride, and alignment

@frozen enum MemoryLayout<T>

  • static var size: Int - The contiguous memory footprint of T, in bytes.
  • static var alignment: Int - The default memory alignment of T, in bytes.
  • static var stride: Int - The number of bytes from the start of one instance of T to the start of the next when stored in contiguous memory or in an Array<T>.

Think of it as Swift's sizeof: an abstraction over how a type is laid out in memory.


MemoryLayout<Int>.size          // returns 8 (on 64-bit)
MemoryLayout<Int>.alignment     // returns 8 (on 64-bit)
MemoryLayout<Int>.stride        // returns 8 (on 64-bit)

MemoryLayout<Int16>.size        // returns 2
MemoryLayout<Int16>.alignment   // returns 2
MemoryLayout<Int16>.stride      // returns 2

MemoryLayout<Bool>.size         // returns 1
MemoryLayout<Bool>.alignment    // returns 1
MemoryLayout<Bool>.stride       // returns 1

MemoryLayout<Float>.size        // returns 4
MemoryLayout<Float>.alignment   // returns 4
MemoryLayout<Float>.stride      // returns 4

MemoryLayout<Double>.size       // returns 8
MemoryLayout<Double>.alignment  // returns 8
MemoryLayout<Double>.stride     // returns 8

MemoryLayout<String>.size       // returns 16 (on 64-bit)
MemoryLayout<String>.alignment  // returns 8 (on 64-bit)
MemoryLayout<String>.stride     // returns 16 (on 64-bit)

struct EmptyStruct { }
MemoryLayout<EmptyStruct>.size      // returns 0
MemoryLayout<EmptyStruct>.alignment // returns 1
MemoryLayout<EmptyStruct>.stride    // returns 1

struct SampleStruct {
  let number: UInt32
  let flag: Bool
}

MemoryLayout<SampleStruct>.size       // returns 5
MemoryLayout<SampleStruct>.alignment  // returns 4
MemoryLayout<SampleStruct>.stride     // returns 8
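
The gap between size (5) and stride (8) is padding: the next element in contiguous storage must start on a 4-byte boundary. A small sketch of how that shows up in practice, repeating the struct so the block stands alone (flagOffset, padding, and bytesForThree are illustrative names):

```swift
struct SampleStruct {
  let number: UInt32
  let flag: Bool
}

// Byte offset of `flag` inside the struct: it sits right after the 4-byte number.
let flagOffset = MemoryLayout<SampleStruct>.offset(of: \SampleStruct.flag)

// Trailing padding added between consecutive elements.
let padding = MemoryLayout<SampleStruct>.stride - MemoryLayout<SampleStruct>.size

// Buffer sizes for N elements are computed from stride, not size.
let bytesForThree = MemoryLayout<SampleStruct>.stride * 3

print(flagOffset as Any, padding, bytesForThree)
```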

class EmptyClass {}

MemoryLayout<EmptyClass>.size      // returns 8 (on 64-bit)
MemoryLayout<EmptyClass>.stride    // returns 8 (on 64-bit)
MemoryLayout<EmptyClass>.alignment // returns 8 (on 64-bit)

class SampleClass {
  let number: Int64 = 0
  let flag = false
}

MemoryLayout<SampleClass>.size      // returns 8 (on 64-bit)
MemoryLayout<SampleClass>.stride    // returns 8 (on 64-bit)
MemoryLayout<SampleClass>.alignment // returns 8 (on 64-bit)

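The results above also reflect that a struct is a value type while a class is a reference type: for a class, MemoryLayout reports only the size of the reference, regardless of the stored properties.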

Pointer

A pointer encapsulates a memory address.

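High-level languages such as Java and JavaScript generally run in a safe mode where, in principle, you cannot obtain a memory address. Swift is a compiled language and must also interoperate with C and Objective-C, so unsafe facilities are indispensable. Swift therefore provides a family of UnsafePointer types that correspond to C pointers.

Choosing an unsafe pointer type involves three decisions:

  • mutable or immutable
  • raw or typed
  • buffer style or not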
// 1
let count = 2
let stride = MemoryLayout<Int>.stride
let alignment = MemoryLayout<Int>.alignment
let byteCount = stride * count

// 2
do {
  print("Raw pointers")
  
  // 3
  let pointer = UnsafeMutableRawPointer.allocate(
    byteCount: byteCount,
    alignment: alignment)
  // 4
  defer {
    pointer.deallocate()
  }
  
  // 5
  pointer.storeBytes(of: 42, as: Int.self)
  pointer.advanced(by: stride).storeBytes(of: 6, as: Int.self)
  pointer.load(as: Int.self)
  pointer.advanced(by: stride).load(as: Int.self)
  
  // 6
  let bufferPointer = UnsafeRawBufferPointer(start: pointer, count: byteCount)
  for (index, byte) in bufferPointer.enumerated() {
    print("byte \(index): \(byte)")
  }
}
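  • This is roughly equivalent to void *ptr = malloc(sizeof(int64_t) * 2); in C (a Swift Int is 64 bits wide on 64-bit platforms)
  • Without a type T, you can only work on raw bytes directly, or name the type on every load and store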
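
When raw memory is going to hold values of one type, it can be bound to that type once instead of naming the type on every access. A minimal sketch, assuming a hypothetical helper named rawToTyped:

```swift
func rawToTyped() -> (first: Int, second: Int) {
  let stride = MemoryLayout<Int>.stride
  let raw = UnsafeMutableRawPointer.allocate(
    byteCount: stride * 2,
    alignment: MemoryLayout<Int>.alignment)
  defer { raw.deallocate() }

  // Bind the raw bytes to Int; the result is a typed pointer to the same memory.
  let typed = raw.bindMemory(to: Int.self, capacity: 2)
  typed.initialize(repeating: 0, count: 2)

  // Typed access: subscripts step by stride automatically.
  typed[0] = 42
  typed[1] = 6

  // Raw access to the same bytes still works, with explicit offsets and types.
  // Int is a trivial type, so no deinitialize call is needed before deallocating.
  return (raw.load(as: Int.self), raw.load(fromByteOffset: stride, as: Int.self))
}

let pair = rawToTyped()
print(pair)
```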

Adding a type

let count = 2

do {
  print("Typed pointers")
  
  let pointer = UnsafeMutablePointer<Int>.allocate(capacity: count)
  pointer.initialize(repeating: 0, count: count)
  defer {
    pointer.deinitialize(count: count)
    pointer.deallocate()
  }
  
  pointer.pointee = 42
  pointer.advanced(by: 1).pointee = 6
  pointer.pointee
  pointer.advanced(by: 1).pointee
  
  let bufferPointer = UnsafeBufferPointer(start: pointer, count: count)
  for (index, value) in bufferPointer.enumerated() {
    print("value \(index): \(value)")
  }
}
  • With a type, values can be read and written as T through pointee

Unsafe memory pointers in Swift - The.Swift.Dev.

#include <stdio.h>

int main () {

    int x = 20;
    int* xPointer = &x;

    printf("x address: `%p`\n", &x);
    printf("x value: `%u`\n", x);
    printf("pointer address: `%p`\n", &xPointer);
    printf("pointer reference: `%p`\n", xPointer); // equals the address of x
    printf("pointer reference value: `%u`\n", *xPointer);

    *xPointer = 420;
    printf("x value: `%u`\n", x);
    printf("pointer reference value: `%u`\n", *xPointer);

    x = 69;
    printf("x value: `%u`\n", x);
    printf("pointer reference value: `%u`\n", *xPointer);

    return 0;
}

The Swift equivalent:

var x = 5

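// typed pointer (not a raw one)
// caution: a pointer created from &x like this is only guaranteed to be valid
// during the initializer call, so the lines below rely on undefined behavior
// and are for illustration only
var p: UnsafeMutablePointer<Int> = .init(&x)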

print("x address: ", UnsafeRawPointer(&x))
print("x value: ", x)
print("pointer address: ", UnsafeRawPointer(&p)) // pointer to pointer
print("pointer reference: ", p) // = &x
print("point reference value: ", p.pointee) // *p

p.pointee = 7
print(x) // 7

x = 9
print(p.pointee) // 9
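
A variant of the same experiment that stays within defined behavior keeps each pointer inside a closure. This is a sketch with an illustrative helper name (scopedPointerDemo), not the code from the referenced article.

```swift
func scopedPointerDemo() -> (afterWrite: Int, readBack: Int) {
  var x = 5

  // *p = 7, with the pointer valid only inside the closure.
  withUnsafeMutablePointer(to: &x) { p in
    p.pointee = 7
  }
  let afterWrite = x

  // Change x directly, then read it back through a fresh temporary pointer.
  x = 9
  let readBack = withUnsafePointer(to: x) { $0.pointee }

  return (afterWrite, readBack)
}

let scoped = scopedPointerDemo()
print(scoped)
```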

Data

A byte buffer in memory.

An abstraction over a char * buffer.

// String to Data
let data = Data("Hello, world!".utf8)

// Data to String
let string = String(decoding: data, as: UTF8.self)
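
A slightly fuller sketch of the round trip, including what happens with bytes that are not valid UTF-8. The variable names are illustrative.

```swift
import Foundation

let text = "Hello, world!"
let data = Data(text.utf8)

// Data to [UInt8] is a plain copy of the bytes.
let bytes = [UInt8](data)

// String(decoding:as:) never fails; invalid sequences become U+FFFD.
let lenient = String(decoding: data, as: UTF8.self)

// String(data:encoding:) returns nil when the bytes are not valid UTF-8.
let strict = String(data: data, encoding: .utf8)

let invalid = Data([0xFF])
let rejected = String(data: invalid, encoding: .utf8)
let repaired = String(decoding: invalid, as: UTF8.self)

print(bytes.count, lenient, strict as Any, rejected as Any, repaired)
```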

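Data vs. Array<UInt8>: Data represents a byte buffer and is used mostly for persistence and for passing bytes between APIs. (The Objective-C NSData is immutable and has a separate NSMutableData, which is a topic of its own; Swift's Data is a value type that can be mutated when declared with var.) Array<UInt8> is a basic Swift type whose elements are stored contiguously, matching the C layout, so its memory is transparent and it can be handed to C directly through withUnsafeBufferPointer or withUnsafeBytes.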

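Also watch out for one pitfall:

    let data: Data = .init(bytes: [0x00, 0xFF], count: 2) // wrong!

This line looks fine but is actually wrong. The first parameter of this Data initializer, bytes, is an UnsafeRawPointer, and the second parameter, count, is measured in bytes ("The number of bytes to copy.").

  • First point: Data always counts in bytes (UInt8, i.e. 8 bits), regardless of the element type
  • [0x00, 0xFF] is an Array<Int>. A Swift Int is 64 bits wide on 64-bit platforms, so 0x00 is stored as 0x0000_0000_0000_0000, i.e. 8 bytes, and the whole array occupies 16 bytes. With count: 2 only the first 2 bytes are copied, and both belong to the first Int, so the result is 00 00 rather than the expected 00 FF

The correct way: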

let data = Data(bytes: [UInt8]([0x00, 0x01, 0x02, 0x03]), count: 4)
print(data.hexString) // 00 01 02 03 (hexString is the computed property defined below)
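
The pitfall can be reproduced safely by making the [Int] explicit. In this sketch the names ints, truncated, whole, and intended are illustrative; the Sequence-based initializer Data(_:) sidesteps the byte count altogether.

```swift
import Foundation

// What the "wrong" line effectively copies from: an array of Int, not of UInt8.
let ints = [0x00, 0xFF]

// Copying 2 bytes takes only part of the first Int, which is zero.
let truncated = ints.withUnsafeBytes { Data(bytes: $0.baseAddress!, count: 2) }

// Copying every byte shows how large the array really is.
let whole = ints.withUnsafeBytes { Data(bytes: $0.baseAddress!, count: $0.count) }

// Stating the element type gives the intended two bytes, with no count to get wrong.
let intended = Data([0x00, 0xFF] as [UInt8])

print(Array(truncated), whole.count, Array(intended))
```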

Data+Extension.swift

import Foundation

extension Data {
  var hexString: String {
    return map { String(format: "%02hhx", $0) }.joined(separator: " ")
  }
}

UnsafeRawBufferPointer

data.withUnsafeBytes { buffer in
  for byte in buffer {
    print(byte)
  }
  print(buffer.baseAddress)
  print(buffer.count)
}
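
Beyond printing, the closure can compute a result from the raw bytes and return it. A small sketch with illustrative names (byteSum, bigEndianValue); the closure's parameter type is spelled out so that the raw-buffer variant of withUnsafeBytes is chosen unambiguously.

```swift
import Foundation

let data = Data([0x00, 0x01, 0x02, 0x03])

// Sum of all bytes, computed directly over Data's storage.
let byteSum = data.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> Int in
  buffer.reduce(0) { $0 + Int($1) }
}

// Assemble a big-endian UInt32 byte by byte, which avoids any alignment concerns.
let bigEndianValue = data.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> UInt32 in
  var value: UInt32 = 0
  for byte in buffer {
    value = (value << 8) | UInt32(byte)
  }
  return value
}

print(byteSum, String(bigEndianValue, radix: 16))
```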

Key reference: 在 Swift 裡頭操作 Bytes (Working with Bytes in Swift) | zonble

  • Conceptually, Data is a wrapper around an Array<UInt8>-style byte buffer

References

nonocast, May 23 '22 04:05