Polynomial Regression with PyTorch

3 min read 01-03-2025

Polynomial regression models non-linear relationships between variables. Where linear regression assumes a straight-line relationship, polynomial regression fits a polynomial curve to the data, which lets it capture curved patterns that a line would miss. PyTorch, a popular deep learning framework, provides the tensor operations, layers, and optimizers needed to implement it in a few lines. This article walks through the process, explaining the concepts and providing practical code examples.

Understanding Polynomial Regression

In linear regression, the relationship between the independent variable (x) and the dependent variable (y) is modeled as a straight line: y = mx + c. Polynomial regression extends this by adding higher-order terms of x, creating a curve instead of a line. A second-order polynomial, for example, would be: y = ax² + bx + c. The model is still linear in its coefficients (a, b, c), so it can be fit with the same machinery as linear regression once the powers of x are treated as input features. The degree of the polynomial determines the complexity of the curve. Higher degrees allow for more complex curves but also increase the risk of overfitting.
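
To make the "linear in the coefficients" idea concrete, here is a minimal sketch that evaluates y = ax² + bx + c as a matrix product between a feature matrix and a coefficient vector. The coefficient values are chosen for illustration only.

```python
import torch

# Illustrative coefficients for y = a*x^2 + b*x + c
a, b, c = 2.0, 3.0, 1.0
x = torch.tensor([-1.0, 0.0, 1.0])

# Feature matrix with one column per term: [x^2, x, 1]
features = torch.stack([x**2, x, torch.ones_like(x)], dim=1)
coeffs = torch.tensor([a, b, c])

# The polynomial is a plain linear combination of the feature columns
y = features @ coeffs
print(y)  # tensor([0., 1., 6.])
```

Fitting a polynomial therefore amounts to finding the coefficient vector, which is exactly what a linear layer learns.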

Implementing Polynomial Regression with PyTorch

Let's build a polynomial regression model using PyTorch. We'll use a simple dataset for demonstration purposes.

First, we import the necessary libraries:

import torch
import torch.nn as nn
import matplotlib.pyplot as plt

Next, we define our dataset:

# Sample Data
torch.manual_seed(0)  # fix the seed so the noise (and the printed losses) are repeatable
X = torch.linspace(-1, 1, 100).reshape(-1, 1)
y = 2 * X**2 + 3 * X + 1 + torch.randn(100, 1) * 0.2  # quadratic signal plus Gaussian noise

# Visualize the data
plt.scatter(X.numpy(), y.numpy())
plt.title("Sample Data for Polynomial Regression")
plt.xlabel("X")
plt.ylabel("y")
plt.show()

Now, let's create our model:

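class PolynomialRegression(nn.Module):
    def __init__(self, degree):
        super().__init__()
        self.degree = degree
        # One weight per power x^1..x^degree; the layer's own bias is the constant term
        self.linear = nn.Linear(degree, 1)

    def forward(self, x):
        # Expand (N, 1) inputs into (N, degree) columns [x, x^2, ..., x^degree]
        x = torch.cat([x**i for i in range(1, self.degree + 1)], dim=1)
        return self.linear(x)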

# Initialize the model (e.g., a 2nd-degree polynomial)
model = PolynomialRegression(degree=2)

# Define loss function and optimizer
criterion = nn.MSELoss()  # mean squared error
# SGD optimizer; because the full dataset is passed at every step, this is plain full-batch gradient descent
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

The PolynomialRegression class creates the polynomial features itself. Its forward method builds the powers of x with tensor operations and passes them to a single linear layer, which learns one coefficient per power.
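
The feature expansion can be checked in isolation. The sketch below uses a hypothetical helper, `polynomial_features` (a name introduced here for illustration), that produces the powers x, x², ..., x^degree and leaves the constant term to a linear layer's bias.

```python
import torch

def polynomial_features(x, degree):
    """Hypothetical helper: map an (N, 1) tensor to (N, degree) columns x, x^2, ..., x^degree."""
    return torch.cat([x**i for i in range(1, degree + 1)], dim=1)

x = torch.tensor([[2.0], [3.0]])
feats = polynomial_features(x, degree=3)

print(feats.shape)  # torch.Size([2, 3])
print(feats)        # rows: [2, 4, 8] and [3, 9, 27]
```

Each input row becomes one row of powers, so a batch of N samples yields an (N, degree) matrix that a linear layer can consume directly.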

Training the model involves iteratively updating the model's parameters to minimize the loss function:

# Training loop
epochs = 1000
for epoch in range(epochs):
    # Forward pass
    y_pred = model(X)
    loss = criterion(y_pred, y)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if epoch % 100 == 0:
        print(f'Epoch {epoch}, Loss: {loss.item():.4f}')

Finally, we can visualize the results:

# Prediction and Visualization
with torch.no_grad():  # no gradients are needed for inference
    y_pred = model(X).numpy()
plt.scatter(X.numpy(), y.numpy(), label='Data')
plt.plot(X.numpy(), y_pred, color='red', label='Prediction')
plt.title("Polynomial Regression Results")
plt.xlabel("X")
plt.ylabel("y")
plt.legend()
plt.show()

This code trains the model for 1000 epochs and prints the loss every 100 epochs. The final plot shows how well the model fits the data.
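
Beyond the plot, the learned coefficients can be read off the linear layer and compared with the values used to generate the data (2, 3, and 1). The following is a self-contained sketch of that check. It assumes a fixed seed and a bare `nn.Linear` over the features [x, x²], and it is not the only way to structure the model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

# Same synthetic setup as the article: y = 2x^2 + 3x + 1 plus noise
X = torch.linspace(-1, 1, 100).reshape(-1, 1)
y = 2 * X**2 + 3 * X + 1 + torch.randn(100, 1) * 0.2

# Features [x, x^2]; the layer's bias plays the role of the constant term
features = torch.cat([X, X**2], dim=1)
layer = nn.Linear(2, 1)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
criterion = nn.MSELoss()

for _ in range(1000):
    optimizer.zero_grad()
    loss = criterion(layer(features), y)
    loss.backward()
    optimizer.step()

# Weight order follows the feature order: first x, then x^2
b_hat, a_hat = layer.weight.detach().flatten().tolist()
c_hat = layer.bias.item()
print(f"a = {a_hat:.2f}, b = {b_hat:.2f}, c = {c_hat:.2f}")
```

The estimates should land near 2, 3, and 1, with small deviations caused by the added noise.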

Choosing the Degree of the Polynomial

The degree of the polynomial is a crucial hyperparameter. A low degree might underfit (poorly approximate the data), while a high degree might overfit (perform well on the training data but poorly on unseen data). Techniques like cross-validation can help determine the optimal degree.
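
One way to compare degrees is k-fold cross-validation. The sketch below is an illustration under stated assumptions: to stay short it fits each candidate with a closed-form least-squares solve (`torch.linalg.lstsq`) instead of the gradient-descent loop used in this article, and the helper names `design_matrix` and `cv_mse` are hypothetical.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed for repeatable noise

X = torch.linspace(-1, 1, 100).reshape(-1, 1)
y = 2 * X**2 + 3 * X + 1 + torch.randn(100, 1) * 0.2

def design_matrix(x, degree):
    # Columns 1, x, ..., x^degree
    return torch.cat([x**i for i in range(degree + 1)], dim=1)

def cv_mse(degree, k=5):
    """Average validation MSE over k folds for one polynomial degree."""
    # Same shuffled split for every degree so the scores are comparable
    perm = torch.randperm(len(X), generator=torch.Generator().manual_seed(0))
    folds = perm.chunk(k)
    errors = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = torch.cat([folds[j] for j in range(k) if j != i])
        # Closed-form least-squares fit on the training folds
        coeffs = torch.linalg.lstsq(design_matrix(X[train_idx], degree), y[train_idx]).solution
        pred = design_matrix(X[val_idx], degree) @ coeffs
        errors.append(torch.mean((pred - y[val_idx]) ** 2).item())
    return sum(errors) / k

scores = {d: cv_mse(d) for d in range(1, 7)}
for d, s in scores.items():
    print(f"degree {d}: validation MSE {s:.4f}")

best = min(scores, key=scores.get)
print("lowest validation MSE at degree", best)
```

On this quadratic dataset, degree 1 should score clearly worse than degree 2, while higher degrees typically bring little or no further improvement.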

Advantages and Disadvantages of Polynomial Regression

Advantages:

  • Handles Non-linear Relationships: Effectively models relationships that are not linear.
  • Relatively Simple to Implement: Easy to understand and implement, especially with libraries like PyTorch.

Disadvantages:

  • Overfitting: Prone to overfitting, especially with high-degree polynomials and small datasets.
  • Extrapolation Issues: Predictions outside the range of the training data can be unreliable.
  • Interpretability: Higher-degree polynomials can be difficult to interpret.
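
The extrapolation issue can be illustrated with a short experiment. This sketch assumes a deliberately oversized degree (9) fitted by closed-form least squares on data from [-1, 1], then compares the error at a point inside that range with the error at a point well outside it. The helper names are hypothetical.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed for repeatable noise

# Training data only covers [-1, 1]; float64 keeps the high-degree fit numerically stable
X = torch.linspace(-1, 1, 100, dtype=torch.float64).reshape(-1, 1)
y = 2 * X**2 + 3 * X + 1 + 0.2 * torch.randn(100, 1, dtype=torch.float64)

def design_matrix(x, degree):
    # Columns 1, x, ..., x^degree
    return torch.cat([x**i for i in range(degree + 1)], dim=1)

def true_fn(x):
    # Noise-free function used to generate the data
    return 2 * x**2 + 3 * x + 1

degree = 9  # deliberately too high for quadratic data
coeffs = torch.linalg.lstsq(design_matrix(X, degree), y).solution

x_inside = torch.tensor([[0.5]], dtype=torch.float64)   # within the training range
x_outside = torch.tensor([[3.0]], dtype=torch.float64)  # far outside it

err_inside = (design_matrix(x_inside, degree) @ coeffs - true_fn(x_inside)).abs().item()
err_outside = (design_matrix(x_outside, degree) @ coeffs - true_fn(x_outside)).abs().item()

print(f"absolute error at x=0.5: {err_inside:.4f}")
print(f"absolute error at x=3.0: {err_outside:.4f}")
```

The high-order terms that barely matter inside [-1, 1] dominate once x grows, so the error outside the training range is expected to be far larger than the error inside it.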

Conclusion

Polynomial regression offers a flexible approach to modeling non-linear relationships, and PyTorch keeps the implementation short: a feature expansion, a linear layer, and a standard training loop. The main decision is the polynomial degree, since too low a degree underfits and too high a degree overfits. Always evaluate the model on unseen data to detect overfitting and to check that it generalizes.
