Both Character Literals and String Literals Can Be Assigned to a Char Variable
Strings and Characters¶
A string is a series of characters, such as "hello, world"
or "albatross"
. Swift strings are represented by the String
type. The contents of a String
can be accessed in various ways, including as a collection of Character
values.
Swift's String
and Character
types provide a fast, Unicode-compliant way to work with text in your code. The syntax for string creation and manipulation is lightweight and readable, with a string literal syntax that's similar to C. String concatenation is as simple as combining two strings with the +
operator, and string mutability is managed by choosing between a constant or a variable, just like any other value in Swift. You can also use strings to insert constants, variables, literals, and expressions into longer strings, in a process known as string interpolation. This makes it easy to create custom string values for display, storage, and printing.
Despite this simplicity of syntax, Swift's String
type is a fast, modern string implementation. Every string is composed of encoding-independent Unicode characters, and provides support for accessing those characters in various Unicode representations.
Note
Swift's String
type is bridged with Foundation's NSString
class. Foundation also extends String
to expose methods defined by NSString
. This means, if you import Foundation, you can access those NSString
methods on String
without casting.
For more information about using String
with Foundation and Cocoa, see Bridging Between String and NSString.
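As a small illustration of the bridging described in the note, the sketch below calls two members that Foundation adds to String: components(separatedBy:) and capitalized. It assumes Foundation is available on your platform; the sentence constant is made up for the example.

```swift
import Foundation

let sentence = "swift strings are bridged"

// components(separatedBy:) is provided by Foundation, not the standard library.
let words = sentence.components(separatedBy: " ")

// capitalized is another Foundation member, usable on String without casting.
let title = sentence.capitalized

print(words)
print(title)
```

Without the import, neither call compiles, which is a quick way to tell which members come from Foundation.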
String Literals¶
You can include predefined String
values within your code as string literals. A string literal is a sequence of characters surrounded by double quotation marks ( "
).
Use a string literal as an initial value for a constant or variable:
- let someString = "Some string literal value"
Note that Swift infers a type of String
for the someString
constant because it's initialized with a string literal value.
Multiline String Literals¶
If you need a string that spans several lines, use a multiline string literal, which is a sequence of characters surrounded by three double quotation marks:
- let quotation = """
- The White Rabbit put on his spectacles. "Where shall I begin,
- please your Majesty?" he asked.
- "Begin at the beginning," the King said gravely, "and go on
- till you come to the end; then stop."
- """
A multiline string literal includes all of the lines between its opening and closing quotation marks. The string begins on the first line after the opening quotation marks ( """
) and ends on the line before the closing quotation marks, which means that neither of the strings below start or end with a line break:
- let singleLineString = "These are the same."
- let multilineString = """
- These are the same.
- """
When your source code includes a line break inside of a multiline string literal, that line break also appears in the string's value. If you want to use line breaks to make your source code easier to read, but you don't want the line breaks to be part of the string's value, write a backslash ( \
) at the end of those lines:
- let softWrappedQuotation = """
- The White Rabbit put on his spectacles. "Where shall I begin, \
- please your Majesty?" he asked.
- "Begin at the beginning," the King said gravely, "and go on \
- till you come to the end; then stop."
- """
To make a multiline string literal that begins or ends with a line feed, write a blank line as the first or last line. For example:
- let lineBreaks = """
-
- This string starts with a line break.
- It also ends with a line break.
-
- """
A multiline string can be indented to match the surrounding code. The whitespace before the closing quotation marks ( """
) tells Swift what whitespace to ignore before all of the other lines. However, if you write whitespace at the start of a line in addition to what's before the closing quotation marks, that whitespace is included.
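A minimal sketch of this indentation rule follows. The constant name linesWithIndentation is an assumption made for illustration; the closing quotation marks are indented by four spaces, so four spaces are stripped from every line.

```swift
let linesWithIndentation = """
    This line doesn't begin with whitespace.
        This line begins with four spaces.
    This line doesn't begin with whitespace.
    """

// Print each line between markers so leading whitespace is visible.
for line in linesWithIndentation.split(separator: "\n") {
    print("[\(line)]")
}
```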
In the example above, even though the entire multiline string literal is indented, the first and last lines in the string don't begin with any whitespace. The middle line has more indentation than the closing quotation marks, so it starts with that extra four-space indentation.
Special Characters in String Literals¶
String literals can include the following special characters:
- The escaped special characters \0 (null character), \\ (backslash), \t (horizontal tab), \n (line feed), \r (carriage return), \" (double quotation mark) and \' (single quotation mark)
- An arbitrary Unicode scalar value, written as \u{n}, where n is a 1–8 digit hexadecimal number (Unicode is discussed in Unicode below)
The code below shows four examples of these special characters. The wiseWords
constant contains two escaped double quotation marks. The dollarSign
, blackHeart
, and sparklingHeart
constants demonstrate the Unicode scalar format:
- let wiseWords = "\"Imagination is more important than knowledge\" - Einstein"
- // "Imagination is more important than knowledge" - Einstein
- let dollarSign = "\u{24}" // $, Unicode scalar U+0024
- let blackHeart = "\u{2665}" // ♥, Unicode scalar U+2665
- let sparklingHeart = "\u{1F496}" // 💖, Unicode scalar U+1F496
Because multiline string literals use three double quotation marks instead of just one, you can include a double quotation mark ( "
) inside of a multiline string literal without escaping it. To include the text """
in a multiline string, escape at least one of the quotation marks. For example:
- let threeDoubleQuotationMarks = """
- Escaping the first quotation mark \"""
- Escaping all three quotation marks \"\"\"
- """
Extended String Delimiters¶
You can place a string literal within extended delimiters to include special characters in a string without invoking their effect. You place your string within quotation marks ( "
) and surround that with number signs ( #
). For example, printing the string literal #"Line 1\nLine 2"#
prints the line feed escape sequence ( \n
) rather than printing the string across two lines.
If you need the special effects of a character in a string literal, match the number of number signs within the string following the escape character ( \
). For example, if your string is #"Line 1\nLine 2"#
and you want to break the line, you can use #"Line 1\#nLine 2"#
instead. Similarly, ###"Line1\###nLine2"###
also breaks the line.
String literals created using extended delimiters can also be multiline string literals. You can use extended delimiters to include the text """
in a multiline string, overriding the default behavior that ends the literal. For example:
- let threeMoreDoubleQuotationMarks = #"""
- Here are three more double quotes: """
- """#
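The difference between the two delimiter forms can be checked directly. This is a small sketch; the constant names raw, broken, and ordinary are chosen only for the example.

```swift
let raw = #"Line 1\nLine 2"#        // backslash and n stay as two ordinary characters
let broken = #"Line 1\#nLine 2"#    // \#n is a real line feed inside extended delimiters
let ordinary = "Line 1\nLine 2"     // the usual escape in a plain literal

print(raw)
print(broken)
print(raw.count, broken.count)
```

Comparing broken with ordinary shows that the two spellings produce the same value, while raw is one character longer because it keeps the backslash and the letter n.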
Initializing an Empty String¶
To create an empty String
value as the starting point for building a longer string, either assign an empty string literal to a variable, or initialize a new String
instance with initializer syntax:
- var emptyString = "" // empty string literal
- var anotherEmptyString = String() // initializer syntax
- // these two strings are both empty, and are equivalent to each other
Find out whether a String
value is empty by checking its Boolean isEmpty
property:
- if emptyString.isEmpty {
- print("Nothing to see here")
- }
- // Prints "Nothing to see here"
String Mutability¶
You signal whether a particular String
can be modified (or mutated) by assigning it to a variable (in which case it can be modified), or to a constant (in which case it can't be modified):
- var variableString = "Horse"
- variableString += " and carriage"
- // variableString is now "Horse and carriage"
- let constantString = "Highlander"
- constantString += " and another Highlander"
- // this reports a compile-time error - a constant string cannot be modified
Note
This approach is different from string mutation in Objective-C and Cocoa, where you choose between two classes ( NSString
and NSMutableString
) to indicate whether a string can be mutated.
Strings Are Value Types¶
Swift's String
type is a value type. If you create a new String
value, that String
value is copied when it's passed to a function or method, or when it's assigned to a constant or variable. In each case, a new copy of the existing String
value is created, and the new copy is passed or assigned, not the original version. Value types are described in Structures and Enumerations Are Value Types.
Swift's copy-by-default String
behavior ensures that when a function or method passes you a String
value, it's clear that you own that exact String
value, regardless of where it came from. You can be confident that the string you are passed won't be modified unless you modify it yourself.
Behind the scenes, Swift's compiler optimizes string usage so that actual copying takes place only when absolutely necessary. This means you always get great performance when working with strings as value types.
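The copy behavior is easy to observe. In this sketch the shout(_:) function and the constant names are hypothetical, introduced only to show that neither assignment nor a function call changes the original value.

```swift
// Hypothetical helper: works on its own copy of the argument.
func shout(_ text: String) -> String {
    var copy = text      // parameters are constants, so mutate a local copy
    copy += "!"
    return copy
}

let original = "Horse"
var duplicate = original         // duplicate is an independent value
duplicate += " and carriage"
let shouted = shout(original)

print(original)
print(duplicate)
print(shouted)
```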
Working with Characters¶
You can access the individual Character
values for a String
by iterating over the string with a for
- in
loop:
- for character in "Dog!🐶" {
- print(character)
- }
- // D
- // o
- // g
- // !
- // 🐶
The for
- in
loop is described in For-In Loops.
Alternatively, you can create a stand-alone Character
constant or variable from a single-character string literal by providing a Character
type annotation:
- let exclamationMark: Character = "!"
String
values can be constructed by passing an array of Character
values as an argument to its initializer:
- let catCharacters: [Character] = ["C", "a", "t", "!", "🐱"]
- let catString = String(catCharacters)
- print(catString)
- // Prints "Cat!🐱"
Concatenating Strings and Characters¶
String
values can be added together (or concatenated) with the addition operator ( +
) to create a new String
value:
- let string1 = "hello"
- let string2 = " there"
- var welcome = string1 + string2
- // welcome now equals "hello there"
You can also append a String
value to an existing String
variable with the addition assignment operator ( +=
):
- var instruction = "look over"
- instruction += string2
- // instruction now equals "look over there"
You can append a Character
value to a String
variable with the String
type's append()
method:
- let exclamationMark: Character = "!"
- welcome.append(exclamationMark)
- // welcome now equals "hello there!"
Note
You can't append a String
or Character
to an existing Character
variable, because a Character
value must contain a single character only.
If you're using multiline string literals to build up the lines of a longer string, you want every line in the string to end with a line break, including the last line. For example:
- let badStart = """
- one
- two
- """
- let end = """
- three
- """
- print(badStart + end)
- // Prints two lines:
- // one
- // twothree
- let goodStart = """
- one
- two
-
- """
- print(goodStart + end)
- // Prints three lines:
- // one
- // two
- // three
In the code above, concatenating badStart
with end
produces a two-line string, which isn't the desired result. Because the last line of badStart
doesn't end with a line break, that line gets combined with the first line of end
. In contrast, both lines of goodStart
end with a line break, so when it's combined with end
the result has three lines, as expected.
String Interpolation¶
String interpolation is a way to construct a new String
value from a mix of constants, variables, literals, and expressions by including their values inside a string literal. You can use string interpolation in both single-line and multiline string literals. Each item that you insert into the string literal is wrapped in a pair of parentheses, prefixed by a backslash ( \
):
- let multiplier = 3
- let message = "\(multiplier) times 2.5 is \(Double(multiplier) * 2.5)"
- // message is "3 times 2.5 is 7.5"
In the example above, the value of multiplier
is inserted into a string literal as \(multiplier)
. This placeholder is replaced with the actual value of multiplier
when the string interpolation is evaluated to create an actual string.
The value of multiplier
is also part of a larger expression later in the string. This expression calculates the value of Double(multiplier) * 2.5
and inserts the result ( 7.5
) into the string. In this case, the expression is written as \(Double(multiplier) * 2.5)
when it's included inside the string literal.
You can use extended string delimiters to create strings containing characters that would otherwise be treated as a string interpolation. For example:
- print(#"Write an interpolated string in Swift using \(multiplier)."#)
- // Prints "Write an interpolated string in Swift using \(multiplier)."
To use string interpolation inside a string that uses extended delimiters, match the number of number signs after the backslash to the number of number signs at the beginning and end of the string. For example:
- print(#"6 times 7 is \#(6 * 7)."#)
- // Prints "6 times 7 is 42."
Note
The expressions you write inside parentheses within an interpolated string can't contain an unescaped backslash ( \
), a carriage return, or a line feed. However, they can contain other string literals.
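To illustrate the last point of the note, the sketch below interpolates an expression that itself contains a string literal (the separator passed to joined). The scores array and the summary constant are assumptions made for the example.

```swift
let scores = [3, 7, 5]

// The first interpolation contains its own string literal, ", ".
let summary = "Scores: \(scores.map { String($0) }.joined(separator: ", ")); total \(scores.reduce(0, +))"

print(summary)
```

Expressions this long are usually easier to read when assigned to a constant first and interpolated by name.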
Unicode¶
Unicode is an international standard for encoding, representing, and processing text in different writing systems. It enables you to represent almost any character from any language in a standardized form, and to read and write those characters to and from an external source such as a text file or web page. Swift's String
and Character
types are fully Unicode-compliant, as described in this section.
Unicode Scalar Values¶
Behind the scenes, Swift's native String
type is built from Unicode scalar values. A Unicode scalar value is a unique 21-bit number for a character or modifier, such as U+0061
for LATIN SMALL LETTER A
( "a"
), or U+1F425
for FRONT-FACING BABY CHICK
( "🐥"
).
Note that not all 21-bit Unicode scalar values are assigned to a character; some scalars are reserved for future assignment or for use in UTF-16 encoding. Scalar values that have been assigned to a character typically also have a name, such as LATIN SMALL LETTER A
and FRONT-FACING BABY CHICK
in the examples above.
Extended Grapheme Clusters¶
Every instance of Swift's Character
type represents a single extended grapheme cluster. An extended grapheme cluster is a sequence of one or more Unicode scalars that (when combined) produce a single human-readable character.
Here's an example. The letter é
can be represented as the single Unicode scalar é
( LATIN SMALL LETTER E WITH ACUTE
, or U+00E9
). However, the same letter can also be represented as a pair of scalars: a standard letter e
( LATIN SMALL LETTER E
, or U+0065
), followed by the COMBINING ACUTE ACCENT
scalar ( U+0301
). The COMBINING ACUTE ACCENT
scalar is graphically applied to the scalar that precedes it, turning an e
into an é
when it's rendered by a Unicode-aware text-rendering system.
In both cases, the letter é
is represented as a single Swift Character
value that represents an extended grapheme cluster. In the first case, the cluster contains a single scalar; in the second case, it's a cluster of two scalars:
- let eAcute: Character = "\u{E9}" // é
- let combinedEAcute: Character = "\u{65}\u{301}" // e followed by ́
- // eAcute is é, combinedEAcute is é
Extended grapheme clusters are a flexible way to represent many complex script characters as a single Character
value. For example, Hangul syllables from the Korean alphabet can be represented as either a precomposed or decomposed sequence. Both of these representations qualify as a single Character
value in Swift:
- let precomposed: Character = "\u{D55C}" // 한
- let decomposed: Character = "\u{1112}\u{1161}\u{11AB}" // ᄒ, ᅡ, ᆫ
- // precomposed is 한, decomposed is 한
Extended grapheme clusters enable scalars for enclosing marks (such as COMBINING ENCLOSING CIRCLE
, or U+20DD
) to enclose other Unicode scalars as part of a single Character
value:
- let enclosedEAcute: Character = "\u{E9}\u{20DD}"
- // enclosedEAcute is é⃝
Unicode scalars for regional indicator symbols can be combined in pairs to make a single Character
value, such as this combination of REGIONAL INDICATOR SYMBOL LETTER U
( U+1F1FA
) and REGIONAL INDICATOR SYMBOL LETTER S
( U+1F1F8
):
- let regionalIndicatorForUS: Character = "\u{1F1FA}\u{1F1F8}"
- // regionalIndicatorForUS is 🇺🇸
Counting Characters¶
To retrieve a count of the Character
values in a string, use the count
property of the string:
- let unusualMenagerie = "Koala 🐨, Snail 🐌, Penguin 🐧, Dromedary 🐪"
- print("unusualMenagerie has \(unusualMenagerie.count) characters")
- // Prints "unusualMenagerie has 40 characters"
Note that Swift's use of extended grapheme clusters for Character
values means that string concatenation and modification may not always affect a string's character count.
For example, if you initialize a new string with the four-character word cafe
, and then append a COMBINING ACUTE ACCENT
( U+0301
) to the end of the string, the resulting string will still have a character count of 4
, with a fourth character of é
, not e
:
- var word = "cafe"
- print("the number of characters in \(word) is \(word.count)")
- // Prints "the number of characters in cafe is 4"
- word += "\u{301}" // COMBINING ACUTE ACCENT, U+0301
- print("the number of characters in \(word) is \(word.count)")
- // Prints "the number of characters in café is 4"
Note
Extended grapheme clusters can be composed of multiple Unicode scalars. This means that different characters, and different representations of the same character, can require different amounts of memory to store. Because of this, characters in Swift don't each take up the same amount of memory within a string's representation. As a result, the number of characters in a string can't be calculated without iterating through the string to determine its extended grapheme cluster boundaries. If you are working with particularly long string values, be aware that the count
property must iterate over the Unicode scalars in the entire string in order to determine the characters for that string.
The count of the characters returned by the count
property isn't always the same as the length
property of an NSString
that contains the same characters. The length of an NSString
is based on the number of 16-bit code units within the string's UTF-16 representation and not the number of Unicode extended grapheme clusters within the string.
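The different ways of counting can be compared side by side. In this sketch, flag and cafe are example constants; utf16.count is used as a stand-in for the 16-bit code-unit count that the note describes, so Foundation is not required.

```swift
let flag = "\u{1F1FA}\u{1F1F8}"   // one Character made of two regional indicator scalars
let cafe = "cafe\u{301}"          // four Characters, the last built from two scalars

for text in [flag, cafe] {
    // Characters, Unicode scalars, UTF-16 code units, UTF-8 code units
    print(text, text.count, text.unicodeScalars.count, text.utf16.count, text.utf8.count)
}
```

Only count reflects what a reader would perceive as characters; the other three depend on how the text happens to be encoded.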
Accessing and Modifying a String¶
You access and modify a string through its methods and properties, or by using subscript syntax.
String Indices¶
Each String
value has an associated index type, String.Index
, which corresponds to the position of each Character
in the string.
As mentioned above, different characters can require different amounts of memory to store, so in order to determine which Character
is at a particular position, you must iterate over each Unicode scalar from the start or end of that String
. For this reason, Swift strings can't be indexed by integer values.
Use the startIndex
property to access the position of the first Character
of a String
. The endIndex
property is the position after the last character in a String
. As a result, the endIndex
property isn't a valid argument to a string's subscript. If a String
is empty, startIndex
and endIndex
are equal.
You access the indices before and after a given index using the index(before:)
and index(after:)
methods of String
. To access an index further away from the given index, you can use the index(_:offsetBy:)
method instead of calling one of these methods multiple times.
You can use subscript syntax to access the Character
at a particular String
index.
- let greeting = "Guten Tag!"
- greeting[greeting.startIndex]
- // G
- greeting[greeting.index(before: greeting.endIndex)]
- // !
- greeting[greeting.index(after: greeting.startIndex)]
- // u
- let index = greeting.index(greeting.startIndex, offsetBy: 7)
- greeting[index]
- // a
Attempting to access an index outside of a string's range or a Character
at an index outside of a string's range will trigger a runtime error.
- greeting[greeting.endIndex] // Error
- greeting.index(after: greeting.endIndex) // Error
Use the indices
property to access all of the indices of individual characters in a string.
- for index in greeting.indices {
- print("\(greeting[index]) ", terminator: "")
- }
- // Prints "G u t e n   T a g ! "
Note
You can use the startIndex
and endIndex
properties and the index(before:)
, index(after:)
, and index(_:offsetBy:)
methods on any type that conforms to the Collection
protocol. This includes String
, as shown here, as well as collection types such as Array
, Dictionary
, and Set
.
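As a sketch of the note above, the same index methods work on an Array. The example also uses index(_:offsetBy:limitedBy:), a standard library variant not covered in this section, which returns nil instead of triggering a runtime error when the offset would pass the limit. The letters and greeting constants are example values.

```swift
let letters = ["a", "b", "c", "d"]

let second = letters[letters.index(after: letters.startIndex)]
let last = letters[letters.index(before: letters.endIndex)]
let third = letters[letters.index(letters.startIndex, offsetBy: 2)]

// Offsetting past the limit yields nil rather than a runtime error.
let greeting = "Guten Tag!"
let tooFar = greeting.index(greeting.startIndex, offsetBy: 20, limitedBy: greeting.endIndex)
let withinRange = greeting.index(greeting.startIndex, offsetBy: 7, limitedBy: greeting.endIndex)

print(second, last, third)
print(tooFar == nil)
if let index = withinRange {
    print(greeting[index])
}
```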
Inserting and Removing¶
To insert a single character into a string at a specified index, use the insert(_:at:)
method, and to insert the contents of another string at a specified index, use the insert(contentsOf:at:)
method.
- var welcome = "hello"
- welcome.insert("!", at: welcome.endIndex)
- // welcome now equals "hello!"
- welcome.insert(contentsOf: " there", at: welcome.index(before: welcome.endIndex))
- // welcome now equals "hello there!"
To remove a single character from a string at a specified index, use the remove(at:)
method, and to remove a substring at a specified range, use the removeSubrange(_:)
method:
- welcome.remove(at: welcome.index(before: welcome.endIndex))
- // welcome now equals "hello there"
- let range = welcome.index(welcome.endIndex, offsetBy: -6)..<welcome.endIndex
- welcome.removeSubrange(range)
- // welcome now equals "hello"
Note
You can use the insert(_:at:)
, insert(contentsOf:at:)
, remove(at:)
, and removeSubrange(_:)
methods on any type that conforms to the RangeReplaceableCollection
protocol. This includes String
, as shown here, as well as collection types such as Array
.
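The same four methods can be tried on an Array, which conforms to RangeReplaceableCollection. The numbers variable is an example value; the comments trace the contents after each call.

```swift
var numbers = [1, 2, 5]

numbers.insert(6, at: numbers.endIndex)       // [1, 2, 5, 6]
numbers.insert(contentsOf: [3, 4], at: 2)     // [1, 2, 3, 4, 5, 6]
numbers.remove(at: 0)                         // [2, 3, 4, 5, 6]
numbers.removeSubrange(1..<3)                 // [2, 5, 6]

print(numbers)
```

Array indices are integers, so the ranges can be written directly; with a String the bounds have to be String.Index values, as in the examples above.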
Substrings¶
When you get a substring from a string (for example, using a subscript or a method like prefix(_:)
), the result is an instance of Substring
, not another string. Substrings in Swift have most of the same methods as strings, which means you can work with substrings the same way you work with strings. However, unlike strings, you use substrings for only a short amount of time while performing actions on a string. When you're ready to store the result for a longer time, you convert the substring to an instance of String
. For example:
- let greeting = "Hello, world!"
- let index = greeting.firstIndex(of: ",") ?? greeting.endIndex
- let beginning = greeting[..<index]
- // beginning is "Hello"
- // Convert the result to a String for long-term storage.
- let newString = String(beginning)
Like strings, each substring has a region of memory where the characters that make up the substring are stored. The difference between strings and substrings is that, as a performance optimization, a substring can reuse part of the memory that's used to store the original string, or part of the memory that's used to store another substring. (Strings have a similar optimization, but if two strings share memory, they're equal.) This performance optimization means you don't have to pay the performance cost of copying memory until you modify either the string or substring. As mentioned above, substrings aren't suitable for long-term storage: because they reuse the storage of the original string, the entire original string must be kept in memory as long as any of its substrings are being used.
In the example above, greeting
is a string, which means it has a region of memory where the characters that make up the string are stored. Because beginning
is a substring of greeting
, it reuses the memory that greeting
uses. In contrast, newString
is a string: when it's created from the substring, it has its own storage. The figure below shows these relationships:
Note
Both String
and Substring
conform to the StringProtocol
protocol, which means it's often convenient for string-manipulation functions to accept a StringProtocol
value. You can call such functions with either a String
or Substring
value.
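A function written against StringProtocol might look like the sketch below. The vowelCount(in:) helper is hypothetical and exists only to show one generic function being called with both a String and a Substring.

```swift
// Hypothetical helper: counts vowels in any StringProtocol value.
func vowelCount<S: StringProtocol>(in text: S) -> Int {
    let vowels: Set<Character> = ["a", "e", "i", "o", "u"]
    return text.lowercased().filter { vowels.contains($0) }.count
}

let greeting = "Hello, world!"
let comma = greeting.firstIndex(of: ",") ?? greeting.endIndex
let beginning = greeting[..<comma]     // a Substring that shares storage with greeting

print(vowelCount(in: greeting))
print(vowelCount(in: beginning))
```

Because the parameter is generic, the substring doesn't have to be converted to a String just to be passed in.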
Comparing Strings¶
Swift provides three ways to compare textual values: string and character equality, prefix equality, and suffix equality.
String and Character Equality¶
String and character equality is checked with the "equal to" operator ( ==
) and the "not equal to" operator ( !=
), as described in Comparison Operators:
- let quotation = "We're a lot alike, you and I."
- let sameQuotation = "We're a lot alike, you and I."
- if quotation == sameQuotation {
- print("These two strings are considered equal")
- }
- // Prints "These two strings are considered equal"
Two String
values (or two Character
values) are considered equal if their extended grapheme clusters are canonically equivalent. Extended grapheme clusters are canonically equivalent if they have the same linguistic meaning and appearance, even if they're composed from different Unicode scalars behind the scenes.
For instance, LATIN SMALL LETTER E WITH ACUTE
( U+00E9
) is canonically equivalent to LATIN SMALL LETTER E
( U+0065
) followed by COMBINING ACUTE ACCENT
( U+0301
). Both of these extended grapheme clusters are valid ways to represent the character é
, and so they're considered to be canonically equivalent:
- // "Voulez-vous un café?" using LATIN SMALL LETTER E WITH ACUTE
- let eAcuteQuestion = "Voulez-vous un caf\u{E9}?"
- // "Voulez-vous un café?" using LATIN SMALL LETTER E and COMBINING ACUTE ACCENT
- let combinedEAcuteQuestion = "Voulez-vous un caf\u{65}\u{301}?"
- if eAcuteQuestion == combinedEAcuteQuestion {
- print("These two strings are considered equal")
- }
- // Prints "These two strings are considered equal"
Conversely, LATIN CAPITAL LETTER A
( U+0041
, or "A"
), as used in English, is not equivalent to CYRILLIC CAPITAL LETTER A
( U+0410
, or "А"
), as used in Russian. The characters are visually similar, but don't have the same linguistic meaning:
- let latinCapitalLetterA: Character = "\u{41}"
- let cyrillicCapitalLetterA: Character = "\u{0410}"
- if latinCapitalLetterA != cyrillicCapitalLetterA {
- print("These two characters aren't equivalent.")
- }
- // Prints "These two characters aren't equivalent."
Note
String and character comparisons in Swift aren't locale-sensitive.
Prefix and Suffix Equality¶
To check whether a string has a particular string prefix or suffix, call the string's hasPrefix(_:)
and hasSuffix(_:)
methods, both of which take a single argument of type String
and return a Boolean value.
The examples below consider an array of strings representing the scene locations from the first two acts of Shakespeare's Romeo and Juliet:
- let romeoAndJuliet = [
- "Act 1 Scene 1: Verona, A public place",
- "Act 1 Scene 2: Capulet's mansion",
- "Act 1 Scene 3: A room in Capulet's mansion",
- "Act 1 Scene 4: A street outside Capulet's mansion",
- "Act 1 Scene 5: The Great Hall in Capulet's mansion",
- "Act 2 Scene 1: Outside Capulet's mansion",
- "Act 2 Scene 2: Capulet's orchard",
- "Act 2 Scene 3: Outside Friar Lawrence's cell",
- "Act 2 Scene 4: A street in Verona",
- "Act 2 Scene 5: Capulet's mansion",
- "Act 2 Scene 6: Friar Lawrence's cell"
- ]
You can use the hasPrefix(_:)
method with the romeoAndJuliet
array to count the number of scenes in Act 1 of the play:
- var act1SceneCount = 0
- for scene in romeoAndJuliet {
- if scene.hasPrefix("Act 1 ") {
- act1SceneCount += 1
- }
- }
- print("There are \(act1SceneCount) scenes in Act 1")
- // Prints "There are 5 scenes in Act 1"
Similarly, use the hasSuffix(_:)
method to count the number of scenes that take place in or around Capulet's mansion and Friar Lawrence's cell:
- var mansionCount = 0
- var cellCount = 0
- for scene in romeoAndJuliet {
- if scene.hasSuffix("Capulet's mansion") {
- mansionCount += 1
- } else if scene.hasSuffix("Friar Lawrence's cell") {
- cellCount += 1
- }
- }
- print("\(mansionCount) mansion scenes; \(cellCount) cell scenes")
- // Prints "6 mansion scenes; 2 cell scenes"
Note
The hasPrefix(_:)
and hasSuffix(_:)
methods perform a character-by-character canonical equivalence comparison between the extended grapheme clusters in each string, as described in String and Character Equality.
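A short sketch of what the note implies: a prefix written with a decomposed é matches a string that stores the precomposed form, while a plain e does not. The constant names are assumptions for the example, and Foundation is not imported, so the standard library implementation is the one exercised here.

```swift
let precomposedText = "caf\u{E9} au lait"   // é stored as the single scalar U+00E9
let decomposedPrefix = "cafe\u{301}"        // e followed by COMBINING ACUTE ACCENT

// Matches, because the two spellings of é are canonically equivalent.
let matchesDecomposed = precomposedText.hasPrefix(decomposedPrefix)

// Doesn't match: the fourth character is é, not e.
let matchesPlain = precomposedText.hasPrefix("cafe")

print(matchesDecomposed, matchesPlain)
```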
Unicode Representations of Strings¶
When a Unicode string is written to a text file or some other storage, the Unicode scalars in that string are encoded in one of several Unicode-defined encoding forms. Each form encodes the string in small chunks known as code units. These include the UTF-8 encoding form (which encodes a string as 8-bit code units), the UTF-16 encoding form (which encodes a string as 16-bit code units), and the UTF-32 encoding form (which encodes a string as 32-bit code units).
Swift provides several different ways to access Unicode representations of strings. You can iterate over the string with a for
- in
statement, to access its individual Character
values as Unicode extended grapheme clusters. This process is described in Working with Characters.
Alternatively, access a String
value in one of three other Unicode-compliant representations:
- A collection of UTF-8 code units (accessed with the string's utf8 property)
- A collection of UTF-16 code units (accessed with the string's utf16 property)
- A collection of 21-bit Unicode scalar values, equivalent to the string's UTF-32 encoding form (accessed with the string's unicodeScalars property)
Each example below shows a different representation of the following string, which is made up of the characters D
, o
, g
, ‼
( DOUBLE EXCLAMATION MARK
, or Unicode scalar U+203C
), and the 🐶 character ( DOG FACE
, or Unicode scalar U+1F436
):
- let dogString = "Dog‼🐶"
UTF-8 Representation¶
You can access a UTF-8 representation of a String
by iterating over its utf8
property. This property is of type String.UTF8View
, which is a collection of unsigned 8-bit ( UInt8
) values, one for each byte in the string's UTF-8 representation:
- for codeUnit in dogString.utf8 {
- print("\(codeUnit) ", terminator: "")
- }
- print("")
- // Prints "68 111 103 226 128 188 240 159 144 182 "
In the example above, the first three decimal codeUnit
values ( 68
, 111
, 103
) represent the characters D
, o
, and g
, whose UTF-8 representation is the same as their ASCII representation. The next three decimal codeUnit
values ( 226
, 128
, 188
) are a three-byte UTF-8 representation of the DOUBLE EXCLAMATION MARK
character. The last four codeUnit
values ( 240
, 159
, 144
, 182
) are a four-byte UTF-8 representation of the DOG FACE
character.
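To connect those byte values back to the scalar, the sketch below prints the three bytes of DOUBLE EXCLAMATION MARK in hexadecimal and then reassembles U+203C from them using the standard three-byte UTF-8 layout. The constant names are assumptions, and the decoding is written only for this three-byte case.

```swift
let doubleExclamation = "\u{203C}"   // DOUBLE EXCLAMATION MARK

// The same three bytes as in the text (226, 128, 188), shown in hexadecimal.
let hexBytes = doubleExclamation.utf8.map { String($0, radix: 16, uppercase: true) }
print(hexBytes)

// A three-byte sequence carries 4 payload bits in the lead byte
// and 6 payload bits in each continuation byte.
let bytes = doubleExclamation.utf8.map { UInt32($0) }
let decoded = ((bytes[0] & 0x0F) << 12) | ((bytes[1] & 0x3F) << 6) | (bytes[2] & 0x3F)
print(String(decoded, radix: 16, uppercase: true))
```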
UTF-16 Representation¶
You can access a UTF-16 representation of a String
by iterating over its utf16
property. This property is of type String.UTF16View
, which is a collection of unsigned 16-bit ( UInt16
) values, one for each 16-bit code unit in the string's UTF-16 representation:
- for codeUnit in dogString.utf16 {
- print("\(codeUnit) ", terminator: "")
- }
- print("")
- // Prints "68 111 103 8252 55357 56374 "
Again, the first three codeUnit
values ( 68
, 111
, 103
) represent the characters D
, o
, and g
, whose UTF-16 code units have the same values as in the string's UTF-8 representation (because these Unicode scalars represent ASCII characters).
The fourth codeUnit
value ( 8252
) is a decimal equivalent of the hexadecimal value 203C
, which represents the Unicode scalar U+203C
for the DOUBLE EXCLAMATION MARK
character. This character can be represented as a single code unit in UTF-16.
The fifth and sixth codeUnit
values ( 55357
and 56374
) are a UTF-16 surrogate pair representation of the DOG FACE
character. These values are a high-surrogate value of U+D83D
(decimal value 55357
) and a low-surrogate value of U+DC36
(decimal value 56374
).
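The surrogate pair can be derived by hand with the standard UTF-16 arithmetic: subtract 0x10000 from the scalar, then split the remaining 20 bits into two 10-bit halves. The sketch below does that for DOG FACE and compares the result with what the utf16 view reports; the constant names are assumptions for the example.

```swift
let scalarValue: UInt32 = 0x1F436            // DOG FACE
let offset = scalarValue - 0x10000           // 20-bit value to split

let highSurrogate = UInt16(0xD800 + (offset >> 10))    // top 10 bits
let lowSurrogate = UInt16(0xDC00 + (offset & 0x3FF))   // bottom 10 bits

let fromString = Array("\u{1F436}".utf16)

print(highSurrogate, lowSurrogate)
print(fromString)
```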
Unicode Scalar Representation¶
You can access a Unicode scalar representation of a String
value by iterating over its unicodeScalars
property. This property is of type UnicodeScalarView
, which is a collection of values of type UnicodeScalar
.
Each UnicodeScalar
has a value
property that returns the scalar's 21-bit value, represented within a UInt32
value:
- for scalar in dogString.unicodeScalars {
- print("\(scalar.value) ", terminator: "")
- }
- print("")
- // Prints "68 111 103 8252 128054 "
The value
properties for the first three UnicodeScalar
values ( 68
, 111
, 103
) once again represent the characters D
, o
, and g
.
The fourth codeUnit
value ( 8252
) is again a decimal equivalent of the hexadecimal value 203C
, which represents the Unicode scalar U+203C
for the DOUBLE EXCLAMATION MARK
character.
The value
property of the fifth and final UnicodeScalar
, 128054
, is a decimal equivalent of the hexadecimal value 1F436
, which represents the Unicode scalar U+1F436
for the DOG FACE
character.
As an alternative to querying their value
properties, each UnicodeScalar
value can also be used to construct a new String
value, such as with string interpolation:
- for scalar in dogString.unicodeScalars {
- print("\(scalar) ")
- }
- // D
- // o
- // g
- // ‼
- // 🐶
Source: https://docs.swift.org/swift-book/LanguageGuide/StringsAndCharacters.html