rubix/iris
==========

An introduction to machine learning in Rubix ML using the famous Iris dataset and the K Nearest Neighbors classifier.


Rubix ML - Iris Flower Classifier
=================================

A lightweight introduction to machine learning in Rubix ML using the famous [Iris dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set) and the K Nearest Neighbors algorithm. By the end of this tutorial, you'll know how to structure a project, instantiate a learner, and train it to make predictions on some test data.

- **Difficulty**: Easy
- **Training time**: Less than a minute

Installation
------------

Clone the project locally using [Composer](https://getcomposer.org/):

```
$ composer create-project rubix/iris
```

Requirements
------------

- [PHP](https://php.net) 7.4 or above

Tutorial
--------

### Introduction

The Iris dataset consists of 50 samples for each of three species of Iris flower: Iris setosa, Iris virginica, and Iris versicolor (pictured below). Each sample comprises four measurements, or *features*: sepal length, sepal width, petal length, and petal width. Our objective is to train a [K Nearest Neighbors](https://rubixml.github.io/ML//latest/classifiers/k-nearest-neighbors.html) (KNN) classifier on the Iris dataset so that it can determine the species of Iris flower for a set of unknown test samples. Let's get started!

[![Iris Flower Species](https://raw.githubusercontent.com/RubixML/Iris/master/docs/images/iris-species.png)](https://raw.githubusercontent.com/RubixML/Iris/master/docs/images/iris-species.png)

### Extracting the Data

The first step is to extract the Iris dataset from the `dataset.ndjson` file in our project folder into our training script. You'll notice that we've also provided the Iris dataset in CSV (Comma-separated Values) format. This is strictly for convenience in case you want to view the dataset in your favorite spreadsheet software. To instantiate a new [Labeled](https://rubixml.github.io/ML//latest/datasets/labeled.html) dataset object, we'll pass an [NDJSON](https://rubixml.github.io/ML//latest/extractors/ndjson.html) extractor pointing to the dataset file to the `fromIterator()` factory method. The factory uses the last column of the data table for the labels and the remaining columns for the feature values of each sample. We'll call this our *training* set.

> **Note:** The source code for this example can be found in the [train.php](https://github.com/RubixML/Iris/blob/master/train.php) file in the project root.

```
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Extractors\NDJSON;

$training = Labeled::fromIterator(new NDJSON('dataset.ndjson'));
```
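To make the "last column is the label" convention concrete, here is a minimal plain-PHP sketch that does not use Rubix ML. The two NDJSON rows and the helper name `splitSamplesAndLabels()` are made up for illustration, and the layout of the real `dataset.ndjson` file may differ.

```php
<?php

// Two hypothetical NDJSON rows: four measurements followed by the species label.
$ndjson = <<<'NDJSON'
[5.1, 3.5, 1.4, 0.2, "Iris setosa"]
[6.3, 3.3, 6.0, 2.5, "Iris virginica"]
NDJSON;

/**
 * Split newline-delimited JSON rows into feature vectors and labels,
 * treating the last column of each row as the label.
 */
function splitSamplesAndLabels(string $ndjson): array
{
    $samples = $labels = [];

    foreach (explode("\n", trim($ndjson)) as $line) {
        $row = json_decode($line, true, 512, JSON_THROW_ON_ERROR);

        $labels[] = array_pop($row); // the last column is the label
        $samples[] = $row;           // the remaining columns are the features
    }

    return [$samples, $labels];
}

[$samples, $labels] = splitSamplesAndLabels($ndjson);
```

After the call, `$samples` holds one array of four numbers per row and `$labels` holds the matching species names in the same order.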

Next, we'll set aside 10 random samples that we'll use later to make some example predictions and score the model. The `randomize()` method on the dataset object will handle shuffling the data to ensure randomness and the `take()` method pulls the first *n* rows from the training set and puts them into a separate dataset object. We do this because we want to test the model on samples that it hasn't been trained with.

```
$testing = $training->randomize()->take(10);
```
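The shuffle-then-take idea can be sketched with plain PHP arrays. This is only an illustration of the holdout split described above, not the library's implementation, and the `range(1, 150)` row identifiers are a hypothetical stand-in for the real samples.

```php
<?php

// Hypothetical stand-in for the dataset: 150 row identifiers.
$training = range(1, 150);

shuffle($training); // randomize the row order in place

// Remove the first 10 rows from the training rows and keep them for testing.
$testing = array_splice($training, 0, 10);

echo count($training), ' training rows, ', count($testing), ' testing rows', PHP_EOL;
// 140 training rows, 10 testing rows
```

Because the 10 rows are removed from the training rows, no sample ends up in both sets.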

### Instantiating the Learner

Next, we'll instantiate the [K Nearest Neighbors](https://rubixml.github.io/ML//latest/classifiers/k-nearest-neighbors.html) classifier and choose the value of the `k` hyper-parameter. Hyper-parameters are constructor parameters that affect the behavior of the learner during training and inference. KNN is a distance-based algorithm that finds the *k* closest samples from the training set and predicts the label that is most common among them. For example, if we choose `k` equal to 5, then we may get 4 labels that are `Iris setosa` and 1 that is `Iris virginica`. In this case, the estimator would predict `Iris setosa` because that is the most common label. To instantiate the learner, pass the value of the hyper-parameter `k` to its constructor. Refer to the docs for more info on KNN's additional hyper-parameters.

```
use Rubix\ML\Classifiers\KNearestNeighbors;

$estimator = new KNearestNeighbors(5);
```
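The voting step from the example above (4 votes for `Iris setosa`, 1 for `Iris virginica`) can be sketched in plain PHP. The function name `majorityVote()` is hypothetical, and tie handling is left unspecified in this sketch.

```php
<?php

/**
 * Return the most common label among the labels of the k nearest neighbors.
 */
function majorityVote(array $neighborLabels): string
{
    // Map each distinct label to the number of times it occurs.
    $counts = array_count_values($neighborLabels);

    arsort($counts); // highest count first, keys preserved

    return (string) array_key_first($counts);
}

$neighbors = ['Iris setosa', 'Iris setosa', 'Iris virginica', 'Iris setosa', 'Iris setosa'];

echo majorityVote($neighbors), PHP_EOL; // Iris setosa
```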

### Training

Now, we're ready to train the learner by calling the `train()` method with the training set we prepared earlier.

```
$estimator->train($training);
```

### Making Predictions

With the model trained, we can make predictions using the testing data by calling the `predict()` method on the testing set.

```
$predictions = $estimator->predict($testing);
```

During inference, the KNN algorithm interprets the features of the samples as spatial coordinates and uses the *distance* between samples to determine the most similar samples from the data it has already seen. As the visualization below shows, the features of each species of Iris flower form distinct clusters that can be learned by the K Nearest Neighbors algorithm.

[![Iris Dataset 3D Plot](https://raw.githubusercontent.com/RubixML/Iris/master/docs/images/iris-dataset-3d-plot.png)](https://raw.githubusercontent.com/RubixML/Iris/master/docs/images/iris-dataset-3d-plot.png)
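To illustrate how distance drives the neighbor search, here is a minimal plain-PHP sketch that ranks training samples by Euclidean distance to a query sample. The helper names `euclidean()` and `nearestLabels()` and the measurements are assumptions made for this example, and a real implementation may organize the search differently.

```php
<?php

/** Straight-line (Euclidean) distance between two feature vectors of equal length. */
function euclidean(array $a, array $b): float
{
    $sum = 0.0;

    foreach ($a as $i => $value) {
        $sum += ($value - $b[$i]) ** 2;
    }

    return sqrt($sum);
}

/** Return the labels of the k training samples closest to the query sample. */
function nearestLabels(array $samples, array $labels, array $query, int $k): array
{
    $distances = [];

    foreach ($samples as $i => $sample) {
        $distances[$i] = euclidean($sample, $query);
    }

    asort($distances); // smallest distance first, keys preserved

    $nearest = array_slice(array_keys($distances), 0, $k);

    return array_map(fn (int $i): string => $labels[$i], $nearest);
}

// Hypothetical measurements, for illustration only.
$samples = [[5.1, 3.5, 1.4, 0.2], [4.9, 3.0, 1.4, 0.2], [6.3, 3.3, 6.0, 2.5]];
$labels = ['Iris setosa', 'Iris setosa', 'Iris virginica'];

$neighbors = nearestLabels($samples, $labels, [5.0, 3.4, 1.5, 0.2], 2);
```

Here the query sits next to the two `Iris setosa` samples and far from the `Iris virginica` one, so both returned labels are `Iris setosa`.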

### Validation Score

We can test the model generated during training by comparing the predictions it makes to the ground-truth labels from the testing set. We'll need to choose a cross validation [Metric](https://rubixml.github.io/ML//latest/cross-validation/metrics/api.html) to output a score that we'll interpret as the generalization ability of our newly trained estimator. [Accuracy](https://rubixml.github.io/ML//latest/cross-validation/metrics/accuracy.html) is a simple classification metric that ranges from 0 to 1 and is calculated as the ratio of correct predictions to the total number of predictions. To obtain the accuracy score, pass the predictions we generated from the model earlier along with the labels from the testing set to the `score()` method on the metric instance.

```
use Rubix\ML\CrossValidation\Metrics\Accuracy;

$metric = new Accuracy();

$score = $metric->score($predictions, $testing->labels());
```
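For intuition, the accuracy calculation can be written out by hand. The following is a plain-PHP sketch with a hypothetical `accuracy()` helper and made-up predictions. It is not the library's code.

```php
<?php

/** Fraction of predictions that match the ground-truth labels. */
function accuracy(array $predictions, array $labels): float
{
    if (count($predictions) !== count($labels) || count($labels) === 0) {
        throw new InvalidArgumentException('Need equally sized, non-empty arrays.');
    }

    $correct = 0;

    foreach ($predictions as $i => $prediction) {
        if ($prediction === $labels[$i]) {
            ++$correct;
        }
    }

    return $correct / count($labels);
}

// Hypothetical predictions and labels: 3 of the 4 predictions are correct.
$predictions = ['Iris setosa', 'Iris virginica', 'Iris versicolor', 'Iris setosa'];
$labels = ['Iris setosa', 'Iris virginica', 'Iris virginica', 'Iris setosa'];

echo accuracy($predictions, $labels), PHP_EOL; // 0.75
```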

Now you're ready to run the training script from the command line.

```
php train.php
```

### Next Steps

Congratulations on completing the introduction to machine learning in PHP with Rubix ML using the Iris dataset. Now you're ready to experiment on your own. For example, you may want to try different values of `k` or swap out the default [Euclidean](https://rubixml.github.io/ML//latest/kernels/distance/euclidean.html) distance kernel for another one such as [Manhattan](https://rubixml.github.io/ML//latest/kernels/distance/manhattan.html) or [Minkowski](https://rubixml.github.io/ML//latest/kernels/distance/minkowski.html).
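To see how the distance kernels mentioned above relate to one another, here is a plain-PHP sketch of the Minkowski distance, which reduces to Manhattan at `p = 1` and to Euclidean at `p = 2`. The `minkowski()` function is a hypothetical helper written for this example and is separate from the library's kernel classes.

```php
<?php

/**
 * Minkowski distance of order p between two feature vectors.
 * p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance.
 */
function minkowski(array $a, array $b, float $p): float
{
    $sum = 0.0;

    foreach ($a as $i => $value) {
        $sum += abs($value - $b[$i]) ** $p;
    }

    return $sum ** (1.0 / $p);
}

$a = [0.0, 0.0];
$b = [3.0, 4.0];

$manhattan = minkowski($a, $b, 1.0); // |3| + |4| = 7
$euclidean = minkowski($a, $b, 2.0); // sqrt(9 + 16) = 5
```

Different kernels rank neighbors differently, so changing the kernel can change which training samples count as the *k* closest.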

Original Dataset
----------------

- Creator: Ronald Fisher
- Contact: Michael Marshall
- Email: (1) MARSHALL%PLU '@' io.arc.nasa.gov

### References

> - R. A. Fisher. (1936). The use of multiple measurements in taxonomic problems.
> - Dua, D. and Graff, C. (2019). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

License
-------

The code is licensed [MIT](LICENSE) and the tutorial is licensed [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
