<p align="center"><a href="https://laravel.com" target="_blank"><img src="https://raw.githubusercontent.com/laravel/art/master/logo-lockup/5%20SVG/2%20CMYK/1%20Full%20Color/laravel-logolockup-cmyk-red.svg" width="400"></a></p>
<p align="center">
<a href="https://travis-ci.org/laravel/framework"><img src="https://travis-ci.org/laravel/framework.svg" alt="Build Status"></a>
<a href="https://packagist.org/packages/laravel/framework"><img src="https://img.shields.io/packagist/dt/laravel/framework" alt="Total Downloads"></a>
<a href="https://packagist.org/packages/laravel/framework"><img src="https://img.shields.io/packagist/v/laravel/framework" alt="Latest Stable Version"></a>
<a href="https://packagist.org/packages/laravel/framework"><img src="https://img.shields.io/packagist/l/laravel/framework" alt="License"></a>
</p>
## About Laravel
Laravel is a web application framework with expressive, elegant syntax. We believe development must be an enjoyable and creative experience to be truly fulfilling. Laravel takes the pain out of development by easing common tasks used in many web projects, such as:
- [Simple, fast routing engine](https://laravel.com/docs/routing).
- [Powerful dependency injection container](https://laravel.com/docs/container).
- Multiple back-ends for [session](https://laravel.com/docs/session) and [cache](https://laravel.com/docs/cache) storage.
- Expressive, intuitive [database ORM](https://laravel.com/docs/eloquent).
- Database agnostic [schema migrations](https://laravel.com/docs/migrations).
- [Robust background job processing](https://laravel.com/docs/queues).
- [Real-time event broadcasting](https://laravel.com/docs/broadcasting).
Laravel is accessible, powerful, and provides tools required for large, robust applications.
## Learning Laravel
Laravel has the most extensive and thorough [documentation](https://laravel.com/docs) and video tutorial library of all modern web application frameworks, making it a breeze to get started with the framework.
If you don't feel like reading, [Laracasts](https://laracasts.com) can help. Laracasts contains over 1500 video tutorials on a range of topics including Laravel, modern PHP, unit testing, and JavaScript. Boost your skills by digging into our comprehensive video library.
## Laravel Sponsors
We would like to extend our thanks to the following sponsors for funding Laravel development. If you are interested in becoming a sponsor, please visit the Laravel [Patreon page](https://patreon.com/taylorotwell).
### Premium Partners
- **[Vehikl](https://vehikl.com/)**
- **[Tighten Co.](https://tighten.co)**
- **[Kirschbaum Development Group](https://kirschbaumdevelopment.com)**
- **[64 Robots](https://64robots.com)**
- **[Cubet Techno Labs](https://cubettech.com)**
- **[Cyber-Duck](https://cyber-duck.co.uk)**
- **[Many](https://www.many.co.uk)**
- **[Webdock, Fast VPS Hosting](https://www.webdock.io/en)**
- **[DevSquad](https://devsquad.com)**
- **[Curotec](https://www.curotec.com/services/technologies/laravel/)**
- **[OP.GG](https://op.gg)**
- **[WebReinvent](https://webreinvent.com/?utm_source=laravel&utm_medium=github&utm_campaign=patreon-sponsors)**
- **[Lendio](https://lendio.com)**
## Contributing
Thank you for considering contributing to the Laravel framework! The contribution guide can be found in the [Laravel documentation](https://laravel.com/docs/contributions).
## Code of Conduct
In order to ensure that the Laravel community is welcoming to all, please review and abide by the [Code of Conduct](https://laravel.com/docs/contributions#code-of-conduct).
## Security Vulnerabilities
If you discover a security vulnerability within Laravel, please send an e-mail to Taylor Otwell via [taylor@laravel.com](mailto:taylor@laravel.com). All security vulnerabilities will be promptly addressed.
## License
The Laravel framework is open-sourced software licensed under the [MIT license](https://opensource.org/licenses/MIT).
- **run project**

```bash
php artisan serve
```

Open http://127.0.0.1:8000/ocr in the browser.
- **create database**

```sql
CREATE TABLE `mst_template` (
  `id` INT(11) NOT NULL AUTO_INCREMENT,
  `tpl_name` VARCHAR(50) NOT NULL COLLATE 'utf8mb4_general_ci',
  `tpl_text` VARCHAR(50) NOT NULL COLLATE 'utf8mb4_general_ci',
  `tpl_xy` VARCHAR(50) NOT NULL COLLATE 'utf8mb4_general_ci',
  `in_date` DATETIME NOT NULL DEFAULT current_timestamp(),
  `up_date` DATETIME NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  PRIMARY KEY (`id`) USING BTREE,
  UNIQUE INDEX `tpl_name` (`tpl_name`) USING BTREE
)
COLLATE='utf8mb4_general_ci'
ENGINE=MyISAM
AUTO_INCREMENT=11
;

CREATE TABLE `dt_template` (
  `tpl_detail_id` INT(11) NOT NULL AUTO_INCREMENT,
  `tpl_id` INT(11) NOT NULL,
  `field_name` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8mb4_general_ci',
  `field_xy` VARCHAR(50) NULL DEFAULT NULL COLLATE 'utf8mb4_general_ci',
  `in_date` DATETIME NOT NULL DEFAULT current_timestamp(),
  `up_date` DATETIME NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  PRIMARY KEY (`tpl_detail_id`) USING BTREE,
  INDEX `tpl_id` (`tpl_id`) USING BTREE
)
COLLATE='utf8mb4_general_ci'
ENGINE=MyISAM
AUTO_INCREMENT=7
;
```
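
The two tables are linked through `tpl_id`: `mst_template` holds one row per template and `dt_template` holds one row per field of that template. Below is a minimal sketch of that relationship, assuming a local MySQL database and the `pymysql` driver; the connection settings, column value formats, and sample values are placeholders, not part of the project.

```python
import pymysql  # assumed driver, used here only for the sketch

# Placeholder connection settings for a local setup.
conn = pymysql.connect(host="127.0.0.1", user="root", password="secret",
                       database="ocr", charset="utf8mb4")
try:
    with conn.cursor() as cur:
        # Parent row: one record per template in mst_template.
        cur.execute(
            "INSERT INTO mst_template (tpl_name, tpl_text, tpl_xy) VALUES (%s, %s, %s)",
            ("sample_template", "sample anchor text", "100,50,400,90"),
        )
        tpl_id = cur.lastrowid  # used as dt_template.tpl_id for the detail rows

        # Child row: one record per template field in dt_template, linked by tpl_id.
        cur.execute(
            "INSERT INTO dt_template (tpl_id, field_name, field_xy) VALUES (%s, %s, %s)",
            (tpl_id, "total_amount", "520,700,640,730"),
        )
    conn.commit()
finally:
    conn.close()
```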
- **/ocrpdf/app/Services/OCR/read_pdf.py** : converts the first page of the PDF to an image, runs PaddleOCR on it, and saves the text boxes and detected tables as JSON

......
```python
from paddleocr import PaddleOCR
from pdf2image import convert_from_path
import os
import time
import numpy as np
import json
from pathlib import Path
import cv2
from table_detector import detect_tables

# ==== Config ====
BASE_DIR = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..", ".."))
PDF_NAME = 'aaaa'

# PDF path
pdf_path = Path(BASE_DIR) / "storage" / "pdf" / "fax.pdf"
# Output folder
output_folder = Path(BASE_DIR) / "public" / "image"
#PDF_NAME = pdf_path.stem  # Get the stem of the PDF file
#print(PDF_NAME)

os.makedirs(output_folder, exist_ok=True)
timestamp = int(time.time())
img_base_name = f"{PDF_NAME}_{timestamp}"

# ==== OCR Init ====
ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False
)

# ==== PDF to Image ====
pages = convert_from_path(pdf_path, first_page=1, last_page=1)
image_path = os.path.join(output_folder, f"{img_base_name}.jpg")
pages[0].save(image_path, "JPEG")

# ==== Run OCR ====
image_np = np.array(pages[0])
results = ocr.predict(image_np)

# ==== Convert polygon to bbox ====
def poly_to_bbox(poly):
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return [int(min(xs)), int(min(ys)), int(max(xs)), int(max(ys))]

# ==== Build ocrData ====
ocr_data_list = []
for res in results:
    for text, poly in zip(res['rec_texts'], res['rec_polys']):
        bbox = poly_to_bbox(poly)
        ocr_data_list.append({
            "text": text,
            "bbox": bbox,
            "field": "",
            "hideBorder": False
        })

# ==== Detect table ====
table_info = detect_tables(image_path)

# ==== Build JSON ====
final_json = {
    "ocr_data": ocr_data_list,
    "tables": table_info
}

# ==== Save JSON ====
json_path = os.path.join(output_folder, f"{PDF_NAME}_{timestamp}_with_table.json")
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(final_json, f, ensure_ascii=False, indent=2)

print(f"Saved OCR + Table JSON to: {json_path}")
```
```python
import cv2
import numpy as np
import os

def detect_tables(image_path):
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(f"Cannot read image: {image_path}")

    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (3, 3), 0)

    # Edge detection
    edges = cv2.Canny(blur, 50, 150, apertureSize=3)

    # --- Horizontal lines ---
    lines_h = cv2.HoughLinesP(edges, 1, np.pi/180, threshold=120,
                              minLineLength=int(img.shape[1] * 0.6), maxLineGap=20)
    ys_candidates, line_segments = [], []
    if lines_h is not None:
        for l in lines_h:
            x1, y1, x2, y2 = l[0]
            if abs(y1 - y2) <= 3:  # horizontal
                y_mid = int(round((y1 + y2) / 2))
                ys_candidates.append(y_mid)
                line_segments.append((x1, x2, y_mid))

    # group nearby y values
    ys, tol_y = [], 10
    for y in sorted(ys_candidates):
        if not ys or abs(y - ys[-1]) > tol_y:
            ys.append(y)
    total_rows = max(0, len(ys) - 1)

    # --- Vertical lines ---
    lines_v = cv2.HoughLinesP(edges, 1, np.pi/180, threshold=100,
                              minLineLength=int(img.shape[0] * 0.5), maxLineGap=20)
    xs = []
    if lines_v is not None:
        for l in lines_v:
            x1, y1, x2, y2 = l[0]
            if abs(x1 - x2) <= 3:  # vertical
                xs.append(int(round((x1 + x2) / 2)))

    # group nearby column positions
    x_pos, tol_v = [], 10
    for v in sorted(xs):
        if not x_pos or v - x_pos[-1] > tol_v:
            x_pos.append(v)
    total_cols = max(0, len(x_pos) - 1)

    tables = []
    if len(ys) >= 3 and line_segments:
        y_min, y_max = ys[0], ys[-1]
        min_x = min(seg[0] for seg in line_segments)
        max_x = max(seg[1] for seg in line_segments)
        table_box = (min_x, y_min, max_x, y_max)

        rows = []
        for i in range(len(ys) - 1):
            row_box = (min_x, ys[i], max_x, ys[i + 1])
            rows.append({"row": tuple(int(v) for v in row_box)})
            cv2.rectangle(img, (row_box[0], row_box[1]), (row_box[2], row_box[3]), (0, 255, 255), 2)

        tables.append({
            "total_rows": int(total_rows),
            "total_cols": int(total_cols),
            "table_box": tuple(int(v) for v in table_box),
            "rows_box": rows
        })
        cv2.rectangle(img, (min_x, y_min), (max_x, y_max), (255, 0, 0), 3)

    debug_path = os.path.splitext(image_path)[0] + "_debug.jpg"
    cv2.imwrite(debug_path, img)
    return tables
```
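
A minimal usage sketch for `detect_tables`; the image path below is hypothetical. Besides returning the table metadata, the function writes a `*_debug.jpg` next to the input image with the detected table and row boxes drawn on it.

```python
from table_detector import detect_tables

# Hypothetical page image exported by read_pdf.py
tables = detect_tables("public/image/aaaa_1700000000.jpg")

for t in tables:
    print(f"{t['total_rows']} rows x {t['total_cols']} cols, table box: {t['table_box']}")
```

- **editor styles** : CSS for the template editing screen (bounding boxes, selection overlay, resize edges and corners)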
```css
body {
    font-family: sans-serif; background: #f5f5f5;
}
#app {
    display: flex; gap: 20px; padding: 20px;
}
.right-panel {
    width: 500px; background: #fff; padding: 15px;
    border-radius: 8px; box-shadow: 0 0 5px rgba(0,0,0,0.1);
}
.form-group {
    margin-bottom: 15px;
}
.form-group label {
    font-weight: bold; display: block; margin-bottom: 5px;
}
.form-group input {
    width: 100%;
    padding: 6px;
    border: 1px solid #ccc;
    border-radius: 4px;
}
.left-panel {
    flex: 1;
    position: relative;
    background: #eee;
    border-radius: 8px;
    overflow: hidden;
    user-select: none;
}
.pdf-container {
    position: relative; display: inline-block;
}
.bbox {
    position: absolute;
    border: 2px solid #ff5252;
    /*background-color: rgba(255, 82, 82, 0.2);*/
    cursor: pointer;
}
.bbox.active {
    border-color: #199601 !important;
    background-color: rgba(25, 150, 1, 0.4) !important;
}
@keyframes focusPulse {
    0% { transform: scale(1); }
    50% { transform: scale(1.05); }
    100% { transform: scale(1); }
}
select {
    position: absolute;
    z-index: 10;
    background: #fff;
    border: 1px solid #ccc;
}
.select-box {
    position: absolute;
    /*border: 2px dashed #2196F3;*/
    background-color: rgba(33, 150, 243, 0.2);
    pointer-events: none;
    z-index: 5;
}
.delete-btn {
    position: absolute;
    top: 50%;
    right: -35px;
    transform: translateY(-50%);
    cursor: pointer;
    padding: 3px 6px;
    z-index: 20;
}
.edge {
    position: absolute;
    z-index: 25;
}
.edge.top, .edge.bottom {
    height: 8px;
    cursor: ns-resize;
}
.edge.left, .edge.right {
    width: 8px;
    cursor: ew-resize;
}
.edge.top {
    top: -4px;
    left: 0;
    right: 0;
}
.edge.bottom {
    bottom: -4px;
    left: 0;
    right: 0;
}
.edge.left {
    top: 0;
    bottom: 0;
    left: -4px;
}
.edge.right {
    top: 0;
    bottom: 0;
    right: -4px;
}
.corner {
    position: absolute;
    width: 14px;
    height: 14px;
    background: transparent;
    border: 2px solid transparent; /* transparent by default; only two edges get a color */
    z-index: 30;
    opacity: .95;
    transition: border-width .08s ease, transform .08s ease, opacity .08s ease;
    pointer-events: auto; /* capture the resize drag events */
}
/* Each corner shows two edges, rounded at the matching corner */
.corner.top-left {
    top: -8px; left: -8px;
    /*border-left-color: var(--corner-color);*/
    /*border-top-color: var(--corner-color);*/
    border-top-left-radius: 6px;
    cursor: nwse-resize;
}
.corner.top-right {
    top: -8px; right: -8px;
    /*border-right-color: var(--corner-color);*/
    /*border-top-color: var(--corner-color);*/
    border-top-right-radius: 6px;
    cursor: nesw-resize;
}
.corner.bottom-left {
    bottom: -8px; left: -8px;
    /*border-left-color: var(--corner-color);*/
    /*border-bottom-color: var(--corner-color);*/
    border-bottom-left-radius: 6px;
    cursor: nesw-resize;
}
.corner.bottom-right {
    bottom: -8px; right: -8px;
    /*border-right-color: var(--corner-color);*/
    /*border-bottom-color: var(--corner-color);*/
    border-bottom-right-radius: 6px;
    cursor: nwse-resize;
}
/* Hover effect: thicker and more visible */
.corner:hover {
    border-width: 3px;
    opacity: 1;
    transform: scale(1.02);
}
```