Automate Your Workflow with Clevrr Computer’s AI Agents

Control your computer with AI precision using Clevrr Computer, a free, open-source automation tool for mouse, keyboard, and screen interactions.

Clevrr Computer is an open-source computer automation tool inspired by Anthropic’s “Computer Use” concept.

It uses AI agents backed by models such as Azure OpenAI or Google Gemini, together with the PyAutoGUI library, to perform system actions, automating keyboard, mouse, and screen interactions with precision.

Features

  • Screen analysis and intelligent navigation
  • Automated mouse and keyboard control
  • Screenshot capture and window management
  • Multi-modal AI processing
  • Error handling with user feedback
  • Safety protocols for system interactions
  • Chain-of-thought task execution

Use Cases

  • Automated Software Testing: Clevrr Computer can automate repetitive testing procedures, interacting with software interfaces just like a human tester. This frees up human testers for more complex tasks.
  • Data Entry and Form Filling: Imagine automatically populating forms with data from spreadsheets. Clevrr Computer can handle this, reducing manual data entry and minimizing errors.
  • Content Creation Workflow: Clevrr Computer can automate parts of a content creation workflow, such as resizing images, uploading files, or scheduling posts, increasing efficiency.
  • Robotic Process Automation (RPA): Businesses can use Clevrr Computer to automate repetitive tasks within their existing software systems, reducing manual effort and improving workflow efficiency.
  • Accessibility Tool: For users with limited mobility, Clevrr Computer can provide an alternative way to interact with computer systems through voice commands and automated actions.
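
The data entry use case can be illustrated without the AI layer at all. The sketch below is not Clevrr Computer's own code: it turns CSV rows into a flat list of keystroke actions and hands them to PyAutoGUI only when asked to execute them. The Tab/Enter navigation between form fields and the helper names `rows_to_actions` and `run_actions` are assumptions made for illustration.

```python
import csv
import io


def rows_to_actions(csv_text):
    """Turn CSV rows into (action, value) pairs.

    Each field is typed, Tab moves to the next field, Enter submits the row.
    """
    actions = []
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        for index, value in enumerate(row):
            actions.append(("write", value))
            if index < len(row) - 1:
                actions.append(("press", "tab"))
        actions.append(("press", "enter"))
    return actions


def run_actions(actions, backend=None):
    """Execute actions with PyAutoGUI, or any object exposing write() and press()."""
    if backend is None:
        import pyautogui  # imported lazily because it needs a desktop session
        backend = pyautogui
    for kind, value in actions:
        if kind == "write":
            backend.write(value)
        else:
            backend.press(value)


sample = "name,email\nAda,ada@example.com\n"
planned = rows_to_actions(sample)
print(planned)
```

Building the action list separately from executing it makes it possible to review what will be typed before anything touches the keyboard, which matters for a tool with system-level access.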

How It Works

Clevrr Computer uses a multi-modal AI agent coupled with a screen capture mechanism. The agent receives a task, develops a plan, and uses the get_screen_info tool to capture and analyze screen content. A multi-modal LLM interprets the screen and guides the agent. The agent then executes actions using the PythonREPLAst tool and PyAutoGUI.
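
The loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: `capture_screen`, `decide`, and `execute` are hypothetical stand-ins for the get_screen_info tool, the multi-modal LLM, and the PyAutoGUI-backed executor. They are stubbed here so the control flow can be run without a model or a display.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    action: str          # e.g. "click", "type", or "done"
    argument: str = ""


@dataclass
class AgentRun:
    task: str
    history: list = field(default_factory=list)


def run_agent(task, capture_screen, decide, execute, max_steps=10):
    """Observe the screen, ask the model for the next step, act, and repeat."""
    run = AgentRun(task)
    for _ in range(max_steps):  # hard cap so a confused model cannot loop forever
        screen = capture_screen()
        step = decide(task, screen, run.history)
        if step.action == "done":
            break
        execute(step)
        run.history.append(step)
    return run


# Hypothetical stubs standing in for the screen tool and the LLM.
def fake_capture():
    return "login form with a username field"


def fake_decide(task, screen, history):
    script = [Step("click", "username field"), Step("type", "ada"), Step("done")]
    return script[len(history)]


executed = []
result = run_agent("log in as ada", fake_capture, fake_decide, executed.append)
print([(s.action, s.argument) for s in result.history])
```

The `max_steps` cap is a design choice of this sketch rather than a documented Clevrr Computer setting; some bound of this kind is a common safeguard when a model drives real mouse and keyboard input.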

Pros

  • Open-source and freely available.
  • Automates tedious tasks.
  • Offers flexibility through model choices (Azure OpenAI or Google Gemini).
  • Provides a floating user interface for easy interaction.

Cons

  • Beta software with potential instability.
  • Security risks associated with system-level access.
  • Requires technical knowledge to set up and configure.
  • Potential for unintended actions if not used carefully.

Installation

Before you begin, make sure Python and Git are installed on your system.

1. Clone the Repository: Open your terminal or command prompt and clone the Clevrr Computer repository using Git:

    git clone https://github.com/Clevrr-AI/Clevrr-Computer.git

2. Navigate to the Directory: Change your current directory to the newly cloned repository:

    cd Clevrr-Computer

3. Install Dependencies: Clevrr Computer relies on several Python libraries. Install these dependencies using pip:

    pip install -r requirements.txt

4. Set Up Environment Variables: Clevrr Computer uses a .env file to store API keys and configuration settings.

  • Rename the existing .env_dev file to .env:
       mv .env_dev .env
  • Open the .env file with a text editor and add your API keys for either Azure OpenAI or Google Gemini. You’ll need to obtain these keys from the respective providers. Fill in the following placeholders:
       AZURE_OPENAI_API_KEY=<YOUR_AZURE_API_KEY>
       AZURE_OPENAI_ENDPOINT=<YOUR_AZURE_ENDPOINT_URL>
       AZURE_OPENAI_API_VERSION=<YOUR_AZURE_API_VERSION>
       AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=<YOUR_AZURE_DEPLOYMENT_NAME>
       GOOGLE_API_KEY=<YOUR_GEMINI_API_KEY>
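
Before launching the application, it can help to confirm that the .env file no longer contains the unfilled placeholders. The following is a minimal standalone check, not part of Clevrr Computer: the function names are hypothetical, the parser handles only simple KEY=VALUE lines, and treating the Azure variables as an all-or-nothing group is an assumption based on the list above.

```python
REQUIRED_AZURE = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
)
REQUIRED_GEMINI = ("GOOGLE_API_KEY",)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_providers(values):
    """Return the providers whose variables are all set to non-placeholder values."""
    def filled(keys):
        return all(values.get(k) and not values[k].startswith("<") for k in keys)

    providers = []
    if filled(REQUIRED_AZURE):
        providers.append("openai")
    if filled(REQUIRED_GEMINI):
        providers.append("gemini")
    return providers


example = "GOOGLE_API_KEY=abc123\nAZURE_OPENAI_API_KEY=<YOUR_AZURE_API_KEY>\n"
print(configured_providers(parse_env(example)))
```

To check a real file, pass the contents of .env to `parse_env`, for example with `pathlib.Path(".env").read_text()`.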

Usage

Once installed, you can run Clevrr Computer from your terminal.

1. Run the Application: The basic command to start Clevrr Computer is:

    python main.py

This will launch the application using the default Gemini model and the floating user interface.

2. Choosing a Model: You can specify the AI model (OpenAI or Gemini) using the --model argument:

  • For OpenAI:
       python main.py --model openai
  • For Gemini (default):
       python main.py --model gemini

3. Floating UI: The application includes a floating user interface built with Tkinter. You can turn the floating behavior on or off with the --float-ui flag:

  • To disable floating UI:
       python main.py --float-ui 0
  • To enable floating UI (default):
       python main.py --float-ui 1

4. Giving Instructions: Once Clevrr Computer is running, give the AI agent its instructions through the interface or, if you have configured the application to receive input another way, through that channel. The instructions themselves depend on the task you want to automate.
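
For readers curious how the --model and --float-ui options described in this section might be handled, here is a small argparse sketch. It mirrors only the flags and defaults stated in this article; the project's actual main.py may parse them differently, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser for the two options described in this section."""
    parser = argparse.ArgumentParser(description="Sketch of the CLI described above")
    # Model backend; Gemini is the stated default.
    parser.add_argument("--model", choices=["openai", "gemini"], default="gemini")
    # 1 keeps the floating UI (default), 0 disables it.
    parser.add_argument("--float-ui", type=int, choices=[0, 1], default=1)
    return parser


# Equivalent to: python main.py --model openai --float-ui 0
args = build_parser().parse_args(["--model", "openai", "--float-ui", "0"])
print(args.model, bool(args.float_ui))
```

Restricting both options with `choices` means a typo such as `--model gemni` fails immediately with a usage message instead of silently falling back to a default.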

Related Resources

Clevrr Computer GitHub Repo: https://github.com/Clevrr-AI/Clevrr-Computer/
