The Slice Type in Rust
In this lesson, we will look into the slice type in Rust.
GitHub repo with all the code
https://github.com/codetit4n/rust-school
For this lesson: https://github.com/codetit4n/rust-school/tree/main/lesson-8
Make sure to star/fork/watch it on GitHub.
Prerequisites
To understand this article you must be familiar with the concept of ownership and references in Rust.
The Slice Type
This is another kind of reference. Which means it will have no ownership.
Definition: Slices let you reference a contiguous sequence of elements in a collection rather than the whole collection. Let's try to understand this:
What problem does it solve?
Consider a problem:
Write a function that takes a string of words separated by spaces and returns the first word it finds in that string. If the function doesn’t find a space in the string, the whole string must be one word, so the entire string should be returned.
Consider the code:
fn main() {
let mut s = String::from("hello world");
let word = first_word(&s);
s.clear();
println!("{}", word);
}
fn first_word(s: &String) -> usize {
let bytes = s.as_bytes();
for (i, &item) in bytes.iter().enumerate() {
if item == b' ' {
return i;
}
}
s.len()
}
In the above example, we are returning the index of the end of the first word, indicated by a space.
Note: For now, we can ignore how the first_word
function is working. Because it is using things like iterators
, enumerate
, etc which we will learn about in upcoming articles.
For now, let's focus on the signature of the first_word
function:
fn first_word(s: &String) -> usize {
Here, we are taking reference to the string as an argument, which is fine because we do not want ownership of the variable in this function.
let mut s = String::from("hello world");
let word = first_word(&s);
Here, we are passing the mutable reference to the String
s in the function first_word
which should return the index of the first space. If we print the value of word
after this it will give: 5
This is all right but the story does not end here. After this we are doing:
s.clear();
println!("{}", word);
s.clear();
will empty the string and make it ""
. After this when we are printing the value of word
, we get:
But the problem is that the underlying string is not present at this moment. So, there's no more string that we could meaningfully use the value 5 with. word is now totally invalid!
Can you see the problem with this?
This program compiles without any errors and would also do so if we used word
after calling s.clear()
. Because word
isn’t connected to the state of s
at all, word
still contains the value 5
. We could use that value 5
with the variable s
to try to extract the first word out, but this would be a bug because the contents of s
have changed since we saved 5
in word
.
Having to worry about the index in word
getting out of sync with the data in s
is tedious and error prone!
Managing these indices is even more brittle if we write a second_word
function. Its signature would have to look like this:
fn second_word(s: &String) -> (usize, usize) {
Now we’re tracking a starting and an ending index, and we have even more values that were calculated from data in a particular state but aren’t tied to that state at all. We have three unrelated variables floating around that need to be kept in sync.
Rust has a solution to this problem: string slices
String Slices
A string slice is a reference to part of a String
, and it looks like this:
let s1 = String::from("hello world");
let hello = &s1[0..5];
let world = &s1[6..11];
println!("{hello} {world}");
Rather than a reference to the entire String
, hello
is a reference to a portion of the String
, specified in the extra [0..5]
bit. We create slices using a range within brackets by specifying [starting_index..ending_index]
, where starting_index
is the first position in the slice and ending_index
is one more than the last position in the slice.
Side note: Internally, the slice data structure stores the starting position and the length of the slice, which corresponds to ending_index
minus starting_index
. So, in the case of let world = &s[6..11];
, world
would be a slice that contains a pointer to the byte at index 6 of s
with a length value of 5
.
Under the hood
Let's see what is happening under the hood:
Here, you can see we have created a string slice which is a completely new reference called world
Ranges in Rust
In our above example, we were using Rust’s ..
range syntax.
Let's look into some basic usage of Ranges. If you want to start at index 0, you can drop the value before the two periods. So, the following are the same:
let s2 = String::from("hello");
let slice = &s2[0..2];
let slice = &s2[..2];
By the same token, if your slice includes the last byte of the String
, you can drop the trailing number. That means these are equal:
let len = s2.len();
let slice = &s2[3..len];
let slice = &s2[3..];
You can also drop both values to take a slice of the entire string. So these are equal:
let slice = &s2[0..len];
let slice = &s2[..];
More on String Slices
Now, let's fix the problem we discussed earlier with string slices. Let’s rewrite first_word
to return a slice. The type that signifies “string slice” is written as &str
:
fn first_word(s: &String) -> &str {
let bytes = s.as_bytes();
for (i, &item) in bytes.iter().enumerate() {
if item == b' ' {
return &s[0..i];
}
}
&s[..]
}
Here notice the signature of the function:
fn first_word(s: &String) -> &str {
In this code, we are returning the string slice of the first word.
Now when we call first_word
, we get back a single value that is tied to the underlying data. The value is made up of a reference to the starting point of the slice and the number of elements in the slice.
We now have a straightforward API that’s much harder to mess up because the compiler will ensure the references into the String
remain valid.
Now, if I compile the following code, It will give error:
fn main() {
let mut s = String::from("hello world");
let word = first_word(&s);
s.clear(); //will produce error
println!("{}", word);
}
We get:
Recall from the borrowing rules that if we have an immutable reference to something, we cannot also take a mutable reference. Because clear
needs to truncate the String
, it needs to get a mutable reference. The println!
after the call to clear
uses the reference in word
, so the immutable reference must still be active at that point. Rust disallows the mutable reference in clear
and the immutable reference in word
from existing at the same time, and compilation fails.
Not only has Rust made our API easier to use, but it has also eliminated an entire class of errors at compile time!
String Literals as Slices
We have talked about String literals in earlier articles. A string literal looks like:
let s = "Hello, world!";
The type of s
here is &str
: It is what the Rust compiler infers for s
it’s a slice pointing to that specific point of the binary/executable. This is also why string literals are immutable; &str
is an immutable reference.
String Slices as Parameters
We can also use string slices as parameters. So, our first_word
function signature will now look like this:
fn first_word(s: &str) -> &str {
A more experienced Rustacean would write the signature like this instead because it allows us to use the same function on both &String
values and &str
values. If we have a string slice, we can pass that directly. If we have a String
, we can pass a slice of the String
or a reference to the String
.
Defining a function to take a string slice instead of a reference to a String
makes our API more general and useful without losing any functionality.
Few examples from the Rust book:
fn main() {
let my_string = String::from("hello world");
// `first_word` works on slices of `String`s, whether partial or whole
let word = first_word(&my_string[0..6]);
let word = first_word(&my_string[..]);
// `first_word` also works on references to `String`s, which are equivalent
// to whole slices of `String`s
let word = first_word(&my_string);
let my_string_literal = "hello world";
// `first_word` works on slices of string literals, whether partial or whole
let word = first_word(&my_string_literal[0..6]);
let word = first_word(&my_string_literal[..]);
// Because string literals *are* string slices already,
// this works too, without the slice syntax!
let word = first_word(my_string_literal);
}
Other Slices
Slices can also be used with other collections like arrays. Consider the code:
let a = [1, 2, 3, 4, 5];
let slice = &a[1..3];
This is pretty much the same as we used in string slices.
This slice has the type &[i32]
. It works the same way as string slices do, by storing a reference to the first element and a length. We'll use this kind of slice for all sorts of other collections, which we will discuss in the upcoming articles of the series.
Summary
The concepts of ownership, borrowing, and slices ensure memory safety in Rust programs at compile time.
The Rust language gives you control over your memory usage in the same way as other systems programming languages, but having the owner of data automatically clean up that data when the owner goes out of scope means you don’t have to write and debug extra code to get this control.